METHODS, SYSTEMS AND COMPUTER PROGRAM PRODUCTS FOR MONITORING
PERFORMANCE OF A MODEM DURING A CONNECTION

FIELD OF THE INVENTION

The present invention relates generally to the field of
modems, and, more particularly, to modem diagnostics.

BACKGROUND OF THE INVENTION

The demand for remote access to information sources and
data retrieval, as evidenced by the success of services such
as the World Wide Web, is a driving force for high-speed
network access technologies. Today's telephone network
offers standard voice services over a 4 kHz bandwidth.
Traditional analog modem standards generally assume that
both ends of a modem communication session have an
analog connection to the public switched telephone network
(PSTN). Because data signals are typically converted from
digital to analog when transmitted towards the PSTN and
then from analog to digital when received from the PSTN,
data rates may be limited to 33.6 kbps as defined in the V.34
transmission recommendation developed by the International
Telecommunications Union (ITU).

The need for an analog modem can be eliminated,
however, by using the basic rate interface (BRI) of the
Integrated Services Digital Network (ISDN). A BRI offers
end-to-end digital connectivity at an aggregate data rate of
160 kbps, which is comprised of two 64 kbps B channels, a
16 kbps D channel, and a separate maintenance channel. The
ISDN offers suitable data rates for Internet access,
telecommuting, remote education services, and some forms
of video conferencing. ISDN deployment, however, has
been very slow due to the substantial investment required of
network providers for new equipment. Because the ISDN is
not very pervasive in the PSTN, the network providers have
typically tariffed ISDN services at relatively high rates,
which may be ultimately passed on to the ISDN subscribers.
In addition to the high service costs, subscribers must
generally purchase or lease network termination equipment
to access the ISDN.

While most subscribers do not enjoy end-to-end digital
connectivity through the PSTN, the PSTN is nevertheless
mostly digital. Typically, the only analog portion of the
PSTN is the phone line or local loop that connects a
subscriber or client modem (e.g., an individual subscriber in
a home, office, or hotel) to the telephone company's central
office (CO). In recent years, local telephone companies have
been replacing portions of their original analog networks
with digital switching equipment. Nevertheless, the connection
between the home and the CO has been the slowest to
change to digital as discussed in the foregoing with respect
to ISDN BRI service. A recent data transmission recommendation
issued by the ITU, known as V.90, takes advantage of
the digital conversions that have been made in the PSTN. By
viewing the PSTN as a digital network, V.90 technology is
able to accelerate data downstream from the Internet or other
information source to a subscriber's computer at data rates
of up to 56 kbps, even when the subscriber is connected to
the PSTN via an analog local loop.

To understand how the V.90 recommendation achieves
this higher data rate, it may be helpful to briefly review the
operation of V.34 analog modems. V.34 modems are optimized
for the situation where both ends of a communication
session are connected to the PSTN by analog lines. Even
though most of the PSTN is digital, V.34 modems treat the
network as if it were entirely analog. Moreover, the V.34
recommendation assumes that both ends of the communication
session suffer impairment due to quantization noise
introduced by analog-to-digital converters. That is, the analog
signals transmitted from the V.34 modems are sampled
at 8000 times per second by a codec upon reaching the
PSTN with each sample being represented or quantized by
an eight-bit pulse code modulation (PCM) codeword. The
codec uses 256, non-uniformly spaced, PCM quantization
levels defined according to either the μ-law or A-law
companding standard.

Because the analog waveforms are continuous and the
binary PCM codewords are discrete, the digits that are sent
across the PSTN can only approximate the original analog
waveform. The difference between the original analog waveform
and the reconstructed quantized waveform is called
quantization noise, which limits the modem data rate.
While quantization noise may limit a V.34 communication
session to 33.6 kbps, it nevertheless affects only analog-to-digital
conversions. The V.90 standard relies on the lack of
analog-to-digital conversions outside of the conversion
made at the subscriber's modem to enable transmission at 56
kbps.

The general environment for which the V.90 standard was
developed is depicted in FIG. 1. An Internet Service Provider
(ISP) 22 is connected to a subscriber's computer 24 via
a V.90 digital server modem 26, through the PSTN 28 via
digital trunks (e.g., T1, E1, or ISDN Primary Rate Interface
(PRI) connections), through a central office switch 32, and
finally through an analog loop to the client's modem 34. The
central office switch 32 is drawn outside of the PSTN 28 to
better illustrate the connection of the subscriber's computer
24 and modem 34 into the PSTN 28. It should be understood
that the central office 32 is, in fact, a part of the PSTN 28.
The operation of a communication session between the
subscriber 24 and an ISP 22 is best described with reference
to the more detailed block diagram of FIG. 2.

Transmission from the server modem 26 to the client
modem 34 will be described first. The information to be
transmitted is first encoded using only the 256 PCM codewords
used by the digital switching and transmission equipment
in the PSTN 28. The PCM codewords are modulated
using a technique known as pulse amplitude modulation
(PAM) in which discrete analog voltage levels are used to
represent each of the 256 PCM codewords. These PAM
signals are transmitted towards the PSTN by the PAM
transmitter 36 where they are received by a network codec.
No information is lost in converting the PAM signals back
to PCM because the codec is designed to interpret the
various voltage levels as corresponding to particular PCM
codewords without sampling the PAM signals. The PCM
data is then transmitted through the PSTN 28 until reaching
the central office 32 to which the client modem 34 is
connected. Before transmitting the PCM data to the client
modem 34, the data is converted from its current form as
either μ-law or A-law companded PCM codewords to PAM
voltages by the codec expander (digital-to-analog (D/A)
converter) 38. These PAM voltages are processed by a
central office hybrid 42 where the unidirectional signal
received from the codec expander 38 is transmitted towards
the client modem 34 as part of a bidirectional signal. A
second hybrid 44 at the subscriber's analog telephone connection
converts the bidirectional signal back into a pair of
unidirectional signals. Finally, the analog signal from the
hybrid 44 is converted into digital PAM samples by an
analog-to-digital (A/D) converter 46, which are received and
decoded by the PAM receiver 48. Note that for transmission
to succeed effectively at 56 kbps, there must be only a single
digital-to-analog conversion and subsequent analog-to-digital
conversion between the server modem 26 and the
client modem 34. Recall that analog-to-digital conversions
in the PSTN 28 can introduce quantization noise, which may
limit the data rate as discussed hereinbefore. Moreover, the
PAM receiver 48 needs to be in synchronization with the 8
kHz network clock to properly decode the digital PAM
samples.
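
The role of quantization noise in the discussion above can be illustrated with a minimal sketch. It assumes the continuous μ-law companding curve with μ = 255 and a uniform 8-bit quantizer in the companded domain; a real network codec uses a segmented approximation of this curve, and the function names below are hypothetical and chosen only for illustration.

```python
import math

MU = 255.0  # assumed mu-law parameter; the function names here are illustrative


def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] to the companded domain [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode(x: float) -> int:
    """Quantize an analog sample to one of 256 codewords (0..255)."""
    y = mu_law_compress(max(-1.0, min(1.0, x)))
    return min(255, int((y + 1.0) / 2.0 * 256))


def decode(code: int) -> float:
    """Return the analog level at the centre of the codeword's interval."""
    y = (code + 0.5) / 256 * 2.0 - 1.0
    return mu_law_expand(y)


def quantization_error(x: float) -> float:
    """Difference between the reconstructed level and the original sample."""
    return decode(encode(x)) - x
```

In this sketch an arbitrary analog sample generally comes back with a nonzero error, and the error grows towards full scale because the levels are non-uniformly spaced. A level that already sits on one of the 256 decoder outputs, however, maps back to the same codeword, which mirrors the observation that no information is lost when the server modem transmits only those levels.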

Transmission from the client modem 34 to the server
modem 26 follows the V.34 data transmission standard. That
is, the client modem 34 includes a V.34 transmitter 52 and
a D/A converter 54 that encode and modulate the digital data
to be sent using techniques such as quadrature amplitude
modulation (QAM). The hybrid 44 converts the unidirectional
signal from the digital-to-analog converter 54 into a
bidirectional signal that is transmitted to the central office
32. Once the signal is received at the central office 32, the
central office hybrid 42 converts the bidirectional signal into
a unidirectional signal that is provided to the central office
codec. This unidirectional, analog signal is converted into
either μ-law or A-law companded PCM codewords by the
codec compressor (A/D converter) 56, which are then transmitted
through the PSTN 28 until reaching the server
modem 26. The server modem 26 includes a conventional
V.34 receiver 58 for demodulating and decoding the data
sent by the V.34 transmitter 52 in the client modem 34. Thus,
data is transferred from the client modem 34 to the server
modem 26 at data rates of up to 33.6 kbps as provided for
in the V.34 standard.

The V.90 standard only offers increased data rates (e.g.,
data rates up to 56 kbps) in the downstream direction from
a server to a subscriber or client. Upstream communication
still takes place at conventional data rates as provided for in
the V.34 standard. Nevertheless, this asymmetry is particularly
well suited for Internet access. For example, when
accessing the Internet, high bandwidth is most useful when
downloading large text, video, and audio files to a subscriber's
computer. Using V.90, these data transfers can be made
at up to 56 kbps. On the other hand, traffic flow from the
subscriber to an ISP consists of mainly keystroke and mouse
commands, which are readily handled by the conventional
rates provided by the V.34 standard.

The V.90 standard, therefore, provides a framework for
transmitting data at rates up to 56 kbps provided the network
is capable of supporting the higher rates. The most notable
requirement is that there can be at most one digital-to-analog
conversion in the path from the server modem to the client
modem. Nevertheless, other digital impairments, such as
robbed bit signaling (RBS) and digital mapping through
PADs, which results in attenuated signals, can also inhibit
transmission at V.90 rates. Communication channels exhibiting
non-linear frequency response characteristics are yet
another impediment to transmission at the V.90 rates. These
factors may limit conventional V.90 performance to less than
the 56 kbps theoretical data rate.

Articles such as Humblet et al., "The Information
Driveway," IEEE Communications Magazine, December
1996, pp. 64-68, Kalet et al., "The Capacity of PCM
Voiceband Channels," IEEE International Conference on
Communications '93, May 23-26, 1993, Geneva,
Switzerland, pp. 507-511, Fischer et al., "Signal Mapping
for PCM Modems," V-pcm Rapporteur Meeting, Sunriver,
Oreg., USA, Sep. 4-12, 1997, and Proakis, "Digital Signaling
Over a Channel with Intersymbol Interference," Digital
Communications, McGraw-Hill Book Company, 1983, pp.
373, 381, provide general background information on digital
communication systems.

As modem performance specifications push ever closer to
the limits supported by the physical media, the potential for
user dissatisfaction with modem performance under various
line conditions increases. For example, it is not uncommon
for 56 k modem users to experience significantly lower
connection speeds or throughput than they may expect in
light of the 56 k capability associated with the modems.
Users may not appreciate that, while manufacturers of such
modems often quote 53.3 kbps as a maximum speed, they
also typically note that line conditions may result in even
lower speeds. Unfortunately, manufacturers currently have
generally not effectively explained why a given customer's
modem may be operating at a particular speed or throughput.
Furthermore, manufacturers are not readily able to explain
why a given customer may encounter an inability to make a
connection or experience dropped connections.

One known approach to evaluating modem performance
is the use of AT commands, such as those provided for by
operating systems, such as Windows™ from Microsoft
Corporation, for communicating with a modem (such as the
#UD command). However, only a limited amount of diagnostic
information may be obtained from a modem using
this approach. Furthermore, the modem communication
session typically must be terminated to obtain information
using AT commands, which not only interrupts ongoing
operations but further may limit the amount and types of
data available from the modem (for example, due to retraining
procedures overwriting various data within the modem).
For example, an interface like Hyperterm™ may be used to
enter AT commands which would preclude the use of such
commands while the modem was in use by a user
application, such as Dial-up Networking, as the Hyperterm™
application would establish any client modem to
server modem connection. A further approach, as described
in U.S. Pat. No. 5,634,022 entitled "Multi-Media Computer
Diagnostic System" provides for insertion of branch instructions
into the operating code of a signal processing device
transferring operations to a diagnostic program. However,
this approach is directed to discovery and correction of
algorithmic or logical faults and may not be well suited to
line condition monitoring. Accordingly, a need exists to
obtain knowledge of customer specific line conditions and
potential interworking problems between a customer's client
modem and a server modem to address customer support
problems.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide methods,
systems and computer program products for monitoring
performance of a modem which may be able to obtain data
[...]

It is a further object of the present invention to provide
such methods, systems and computer program products
which may provide obtained performance data to a provider
for use in addressing customer support problems.

These and other objects, advantages, and features of the
present invention may be provided by methods, systems and
computer program products for monitoring performance of
a modem which obtain diagnostic data directly from a
memory associated with the modem's digital signal processor
(DSP). A secondary path to the DSP memory is utilized
for the monitoring operations so that real time data can be
obtained during connection procedures and during an active
connection. First-in first-out (FIFO) buffers are preferably
incorporated in the DSP memory to track state transitions of
one or more of the state machines within the modem and
various performance data measurements may be obtained
directly from the DSP memory responsive to different state
transition events. The real time collected data may be stored
in a file and provided to a remote location for use in
diagnosing customer problems with specific customer line
connections. Accordingly, real time monitoring of digital
and analog line conditions and modem performance may be
utilized to diagnose problems with modems and line connections.

US 6,823,004 B1

In one embodiment of the present invention, a method is
provided for monitoring performance of a modem including
obtaining data related to the performance of the modem
from a memory of the modem during an active connection
to the modem asynchronously with communication operations
of the modem supporting the active connection.
Preferably, the data related to the performance of the modem
is obtained from a digital signal processor (DSP) memory
using a secondary path to the DSP memory. The data may be
obtained during startup of the active connection. The data
further may be selected from the group consisting of line
condition data and interworking data.

In a further embodiment of the present invention, the data
related to the performance of the modem includes internal
state information and obtaining operations include determining
that a state transition has occurred based on the internal
state information and capturing a selected type of data
related to the performance of the modem responsive to a
state transition. Furthermore, the modem may include a
plurality of state machines each state machine having a
plurality of associated states in which case determining
operations may further include determining that a state
transition of one of the plurality of state machines has
occurred based on detecting a change from one of the
plurality of associated states of the one of the plurality of
state machines to another of the plurality of associated states
of the one of the plurality of state machines and the selected
type of data may be selected based on the one of the plurality
of state machines. The selected type of data may further be
selected based on the one of the plurality of associated states
and the another of the plurality of associated states.

In another embodiment of the present invention, a first-in
first-out (FIFO) buffer is provided in the DSP memory and
determining operations further include placing the internal
state information in the FIFO buffer. A plurality of FIFO
buffers may be provided in the DSP memory, each of the
plurality of FIFO buffers being associated with a different
state machine of the modem and internal state information
associated with a respective state machine of the modem
may be placed in a corresponding one of the plurality of
FIFO buffers.

In a further embodiment of the present invention, the data
related to the performance of the modem is stored in a text
file. The stored data may then be provided to a remote
location for analysis.

In another aspect of the present invention, a system is
provided for monitoring performance of a modem. The
system includes a host system having a host system bus. The
monitoring system further includes a modem. The modem
includes a digital signal processor (DSP) and a memory
coupled to the DSP over a DSP system bus. In addition, the
modem also includes a primary path to the DSP memory that
supports a communication connection. A bus interface
couples the host system bus to the DSP system bus, the bus
interface being configured to allow the host system to access
the DSP memory to obtain data related to performance of the
modem during an active communication session supported
by the primary path of the modem asynchronously with
communication operations of the modem supporting the
active communication session. In one embodiment, the
modem further includes a first-in first-out (FIFO) buffer
coupled to the DSP and the DSP is configured to place
internal state information associated with state machines of
the modem in the FIFO buffer. Furthermore, the modem may
include a plurality of first-in first-out (FIFO) buffers, each of
the FIFO buffers being associated with one of a plurality of
state machines of the modem and the DSP may be configured
to place internal state information associated with the
plurality of state machines in a corresponding one of the
FIFO buffers.

As will further be appreciated by those of skill in the art,
while described above primarily with reference to method
aspects, the present invention may be embodied as methods,
apparatus/systems and/or computer program products.

BRIEF DESCRIPTION OF THE DRAWINGS

Other features of the present invention will be more
readily understood from the following detailed description
of specific embodiments thereof when read in conjunction
with the accompanying drawings, in which:

FIG. 1 is a block diagram illustrating a typical V.90 connection
between a subscriber and an ISP in accordance with
the prior art;

FIG. 2 is a detailed block diagram of the internal architecture
and connections between the client modem, the
central office, and the server modem of FIG. 1;

FIG. 3 is a block diagram of a modem performance
monitoring system in accordance with the present invention;
and

FIG. 4 is a flow chart illustrating operations for obtaining
performance data according to an embodiment of the present
invention.

The present invention now will be described more fully
hereinafter with reference to the accompanying drawings, in
which preferred embodiments of the invention are shown.
This invention may, however, be embodied in different
forms and should not be construed as limited to the embodiments
set forth herein. Rather, these embodiments are provided
so that this disclosure will be thorough and complete,
and will fully convey the scope of the invention to those
skilled in the art. Like reference numbers signify like
elements throughout the description of the figures.

As will be appreciated by those skilled in the art, the 
present invention can be embodied as a method, a system, or 
a computer program product. Accordingly, the present 
invention can take the form of an entirely hardware 
embodiment, an entirely software (including firmware, resi- 
dent software, micro-code, etc.) embodiment, or an embodi- 
ment containing both software and hardware aspects. 
Furthermore, the present invention can take the form of a 
computer program product on a computer-usable or 
computer-readable storage medium having computer-usable 
program code means embodied in the medium for use by or 
in connection with an instruction execution system. In the 
context of this document, a computer-usable or computer- 
readable medium can be any means that can contain, store, 
communicate, propagate, or transport the program for use by 
or in connection with the instruction execution system,
apparatus, or device. 

The computer-usable or computer-readable medium can 
be, for example but not limited to, an electronic, magnetic, 
optical, electromagnetic, infrared, or semiconductor system, 
apparatus, device, or propagation medium. More specific 
examples (a nonexhaustive list) of the computer-readable 
medium would include the following: an electrical connec- 
tion having one or more wires, a portable computer diskette, 
a random access memory (RAM), a read-only memory
(ROM), an erasable programmable read-only memory 
(EPROM or Flash memory), an optical fiber, and a portable 
compact disc read-only memory (CDROM). Note that the 
computer-usable or computer-readable medium could even 
be paper or another suitable medium upon which the pro- 
gram is printed, as the program can be electronically 
captured, via, for instance, optical scanning of the paper or 
other medium, then compiled, interpreted or otherwise pro- 
cessed in a suitable manner if necessary, and then stored in 
a computer memory. 

Computer program code for carrying out operations of the 
present invention is typically written in a high level pro- 
gramming language such as C or C++. Nevertheless, some 
modules or routines may be written in assembly or machine 
language to optimize speed, memory usage, or layout of the 
software or firmware in memory. Assembly language is 
typically used to implement time-critical code segments. 

The present invention will now be further described with 
reference to the block diagram illustration of an embodiment 
of a system for monitoring performance of a modem of FIG. 
3. As shown in the embodiment of FIG. 3, the host system 
300 is connected to a modem 310, such as a V.90 protocol 
modem. While the modem 310 is illustrated in FIG. 3 as 
being separate from the host system 300, it is to be under- 
stood that, in the preferred embodiment of the present 
invention, the modem 310 is an internal modem device 
contained within the host system 300. An example of such 
a host system including a modem which may be modified for 
use according to the teachings of the present invention is the 
ACP modem available with the Think Pad™ Laptop com- 
puter available from International Business Machines Cor- 
poration (IBM). 

The host system 300 is coupled to the modem 310 through 
a primary path 315 which supports communication services 
utilizing the modem 310. More particularly, communica- 
tions from applications executed on the host system 300 are 
conveyed on the primary path 315 to the modem 310 for 
transmission through the port 320 which, in the illustrated 
embodiment, provides a connection to the Public Switched 
Telephone Network (PSTN). Similarly, communications 
from a remote device by a server modem (not shown) are 
received from the PSTN through port 320 and provided to a 
destination application executing on the host system 300 by 
the modem 310. The primary path 315 may be a serial link 
such as an RS-232 connection. It is further to be understood 
that the port 320 may connect to other networks including 
wireless networks or a broadband network. It is to be 
understood that when connected with a wireless network the
modem 310 may be a wireless modem. Similarly, when
connected with a broadband network, the modem 310 may
be a cable modem or a modem for an Asymmetric Digital
Subscriber Line (ADSL), a Symmetric Digital Subscriber Line
(SDSL), a High Speed Digital Subscriber Line (HDSL) or a
Very High Speed Digital Subscriber Line (VDSL).

As further shown in the embodiment of FIG. 3, the
modem 310 includes a digital signal processor (DSP) 340
and an associated DSP memory 345. The DSP 340 is
coupled to the DSP memory 345 over the DSP system bus
350. As used herein, references to the DSP memory 345
associated with the DSP 340 refer to the memory or memories
within the modem 310 which are utilized for data
storage by the DSP 340 during communication operations of
the modem 310 supporting an active connection. This
memory may include a separate memory device coupled to
the DSP 340 over the DSP bus 350 and may further include
memory which is contained within the circuit device of DSP
340 which is nonetheless available over the DSP system bus
350. There also may be a multitude of DSPs being monitored
if more than one DSP is used to implement the modem as,
for example, with some broadband and wireless modems.

The DSP memory 345 further includes one or more
first-in first-out (FIFO) buffers 355, 360. The FIFO buffers
355, 360 implemented in the DSP memory 345 are used to
record state transitions made for one or more of the state
machines of the modem 310 as will be described further later
herein. As used herein, the term "FIFO buffer" includes both
circular buffers and other types of FIFO buffers.
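
As an illustration of the circular variety of FIFO buffer mentioned above, the following minimal sketch records state transitions in a fixed number of slots. The record layout, the class and field names, and the choice to overwrite the oldest record when the buffer is full are assumptions made for the example and are not taken from the described embodiment.

```python
from collections import namedtuple

# Hypothetical transition record; the field names are illustrative only.
Transition = namedtuple("Transition", "machine old_state new_state")


class TransitionFifo:
    """Fixed-size circular FIFO; the oldest record is overwritten when full."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0      # index of the oldest unread record
        self._count = 0     # number of unread records
        self.overruns = 0   # records lost because the reader fell behind

    def put(self, record):
        """Writer side: store a record, dropping the oldest one if full."""
        tail = (self._head + self._count) % len(self._slots)
        self._slots[tail] = record
        if self._count == len(self._slots):
            self._head = (self._head + 1) % len(self._slots)
            self.overruns += 1
        else:
            self._count += 1

    def get(self):
        """Reader side: return the oldest unread record, or None if empty."""
        if self._count == 0:
            return None
        record = self._slots[self._head]
        self._head = (self._head + 1) % len(self._slots)
        self._count -= 1
        return record
```

Counting overruns is one possible way for a reader to notice that it has missed transitions; a design that blocks or discards the newest record instead would be equally consistent with the term "FIFO buffer" as used here.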

The host system 300 is further connected to the modem
310 through the bus interface 325 as shown in the embodiment
of FIG. 3. A secondary path 335 is thereby provided for
accessing the DSP memory 345 through the DSP system bus
350 by the bus interface 325 as will be described further later
herein. The primary path 315 is used to support a communication
connection by the modem 310 while the secondary
path 335 through the bus interface 325 allows the host
system 300 to access the DSP memory 345 to obtain data
related to performance of the modem 310 during an active
communication session supported by the primary path 315
to the modem 310. More particularly, the bus interface 325
provides a connection from the host system bus 330 of the
host system 300 to the DSP system bus 350 of the modem
310. A secondary path 335 can also be provided through
other means, for example, to provide for implementation of
the systems and methods of the present invention where
external modems are used to support the host system 300.
For example, as will be understood by those of skill in the
art, the host system bus 330 may be coupled to an external
communication port, such as a serial port or a parallel port,
which provides a separate cabled link to an external modem
allowing a secondary path of communication and access to
the external modem while the primary coupling to the
external modem is used to support a communication connection.
However, it is preferred that an embodiment such as
that illustrated in FIG. 3 be utilized as a direct connection
between the host bus 330 and the DSP system bus 350
utilizing the bus interface 325 is expected to provide superior
performance characteristics to those which would be
expected using an external communication link.
Accordingly, in preferred embodiments of the present
invention, modem performance is monitored by a host
system 300 containing an internal modem 310. Nonetheless,
the benefits of the present invention may also be obtained in
various other embodiments including those in which the
secondary path 335 does not return to the same host as the
primary path 315. A second host may be co-located or
remote from the first host. In fact, a remote second host
could be at a distant location monitoring a modem connection
through the secondary path 335.
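
The monitoring arrangement described above, in which a host reads the DSP memory over a secondary path and captures selected data when a state transition is seen, can be sketched as follows. This is a simplified simulation under stated assumptions: the class names, the state and parameter names, and the capture plan are all hypothetical, and parameter values are read when the host polls rather than at the instant of the transition.

```python
from collections import deque

# Hypothetical capture plan: which parameters to record when a given state
# machine enters a given state. All names are illustrative only.
CAPTURE_PLAN = {
    ("startup", "PHASE_2"): ["line_probe_snr_db"],
    ("startup", "DATA_MODE"): ["rx_rate_bps", "tx_rate_bps"],
}


class DspMemory:
    """Stand-in for the DSP memory: parameters plus one FIFO per machine."""

    def __init__(self, machines):
        self.params = {}
        self.fifos = {m: deque() for m in machines}

    def dsp_enter_state(self, machine, state):
        """Called by the simulated DSP code that supports the connection."""
        self.fifos[machine].append(state)


class SecondaryPath:
    """Stand-in for the bus interface giving the host access to DSP memory."""

    def __init__(self, memory):
        self._memory = memory

    def pop_transition(self, machine):
        fifo = self._memory.fifos[machine]
        return fifo.popleft() if fifo else None

    def read_param(self, name):
        return self._memory.params.get(name)


def poll_once(path, machines, last_state, log):
    """Drain pending transitions and capture the data selected for each."""
    for machine in machines:
        while (state := path.pop_transition(machine)) is not None:
            if state != last_state.get(machine):
                names = CAPTURE_PLAN.get((machine, state), [])
                values = " ".join(f"{n}={path.read_param(n)}" for n in names)
                line = f"{machine}: {last_state.get(machine)} -> {state} {values}"
                log.append(line.rstrip())
                last_state[machine] = state
```

Here the host-side function never calls into the simulated DSP code, so polling can run on its own schedule while the connection continues. The collected log lines could then be written to a text file for later analysis.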

The DSP 340 is coupled to the FIFO buffers 355, 360 and
is configured to place internal state information associated
with state machines of the modem 310 in the FIFO buffers
355, 360. Each of the FIFO buffers 355, 360 may be
associated with a selected one of a plurality of state
machines of the modem 310 and the DSP 340 is configured
to place internal state information associated with the plurality
of state machines in a corresponding one of the FIFO
buffers 355, 360. Alternatively, a single FIFO buffer may be
provided for all state machines of the modem 310 and a state
tag may be provided to identify the associated state machine
for each transition record in the FIFO buffer. Accordingly,
the host system 300 coupled through the bus interface 325
to the DSP system bus 350 and accessing the DSP memory
345 including the FIFO buffers 355, 360 provides a means
for obtaining data related to the performance of the modem
310 from the DSP memory 345 during an active connection
to the modem 310 asynchronously with communication
operations of the modem 310 supporting the active connection.
As used herein, the term "asynchronously" in this
context means that memory access operations to read data
from the DSP memory 345 and the FIFO buffers 355, 360 to
monitor performance occur independently from operations
of the active connection. For example, the memory reads in
a preferred embodiment are performed by a data acquisition
module executing on the host system 300 independent of the
code executed by the DSP 340 that supports the active
communication connection through the modem 310. More
particularly, an application, such as a Windows™
application, may be executed on a host system 300, such as
an IBM Think Pad mobile computer, to record performance
data obtained according to the present invention.
Accordingly, improved diagnostic capabilities may be provided
through the ability to access the DSP memory 345
while the modem 310 is in operation and without interfering
with or interrupting data flow over the primary path 315 to
the modem 310.

Performance information so obtained may include a variety
of information including the time of day, phone number
dialed, call setup return codes (CSR CODE) such as those
available on Microsoft Corporation's AT code #UD
(UniModem diagnostic command specification), multi-phase
startup procedure information such as phase 1 nego-

tion are particularly directed to environments in which both
a primary path and a secondary path are available to the DSP
memory 345 to provide for monitoring operations to occur
in real time while a communication connection is active
through the modem.

As is evident from the types of information identified
above which may be monitored according to the present
invention, a significant amount of performance information
can be tracked during a communication connection, for
example, on a minute-by-minute basis or responsive to
detection of the occurrence of certain events. The monitoring
system of the present invention may be utilized to
monitor internal states of the modem 310 or state transitions
of one or more state machines implemented within the
modem 310 and to selectively record specified parameters
out of the total set of parameters available within the DSP
memory 345 during state conditions where the selected
parameters are significant or of potential interest to a diagnostic
user. Accordingly, the types of information obtained
may not only be triggered by a state transition but may
further vary depending upon the state into which the respective
state machine of the modem 310 has transitioned.
Information may be collected on a real time basis and
recorded during the life of a connection. Furthermore,
information about disconnects may be gathered and throughput
for a connection can be estimated.

In addition, data may also be collected when a connection
is being attempted, in other words, during the startup phases
before a connection is in use for data communication. Such
data during startup may be especially useful when diagnosing
"failure to connect" problems. Furthermore, as performance
information may be collected on a real-time basis
during a connection, pertinent data may be preserved which
might otherwise be lost as a result of an event causing
diagnostic data in the DSP memory 345 to be overwritten
(for example, during retrains). The performance data may be
recorded while the user of the client modem 310 is actively
connected to a remote server modem in a normal manner

tiation parameters (pursuant, for example, in a V.90 modem, such as through a service provider end user application (e.g. 

to the V.8 protocol), phase 2 through 4 errors, transmit and 40 AOL, IGN Dialer and Windows Dial-up Networking) 

receive speeds, retrains, disconnect errors, phone line executing on the host system 300. Performance data may be 

characteristics, connecting modem brand information, obtained throughout the active connection operations 

power levels, and other protocol specific information. For including both the startup phases and during data commu- 

example, with reference to a V.90 protocol, protocol specific nication as well as the disconnect procedures. Accordingly, 

information may include digital discontinuity impairments, 45 as used herein, the term "active connection" is intended to 

equalizer coefficients, echo canceller coefficients, PADs and encompass not only the data communication portion of a 

RBS (Rob Bit Signaling) intervals, non-linear distortion connection but also the startup phases of operation imple- 

measurements and other information which will be known to mented through the modem 310 and the corresponding 

those of skill in the art and as further described in the disconnect phases. 

standards for modem communication, such as the V.90 50 A further advantage provided by the systems, methods 

standard from the ITU. This type of information may be and computer program products of the present invention is 

obtained and stored by the host system 300 and subsequently that the data collection techniques may be utilized to obtain 

used for activities, such as characterizing the telephone performance data which is specific for each user connection, 

network to which the port 320 is coupled, including indi- The state variables and other performance information col- 

vidual line conditions and other characteristics, which could 55 lected may be unique for a particular situation, in other 

be a source of complaints from modem users. The acquired words, state variables may be collected for each specific 

data may further be utilized to analyze the capability of a client modem to server modem pair for a given point to point 

particular user connection and/or to determine if a connec- connection. Accordingly, when user complaints are 

tion exhibits digital discontinuity which, for example, would received, state variables and other performance data may be 

typically force a V.90 modem to fall back to V.34 standard 60 provided for analysis of the conditions leading to the user 

operations. Accordingly, it is to be understood that, while complaint. 

operations will be described herein primarily with reference Utilization of the FIFO buffers 355, 360, such as in the 

to a V.90 standard, the present invention may be applied illustrated embodiment of FIG. 3, provides further advan- 

beneficially to a variety of different modem types, including tages in monitoring performance data. The use of FIFO 

both voice band and broadband communication modems, as 65 buffers 355, 360 may allow the performance monitoring 

will be known to those of skill in the art. It is further to be application executing on the host system 300 to capture 

understood, however, that the teachings of the present inven- every state transition for the various state machines within 




the modem 310 even when access latency in reading the 
DSP memory 345 is longer than the duration of some states. 
In addition, time critical transient data may also be placed in 
the FIFO buffers 355, 360 and further may be only condi- 
tionally stored based on specific state transitions within the 
modem 310. Furthermore, other performance parameter data 
which is not as transient may also be conditionally traced on 
specific state transitions within the modem 310 by reading 
the parameters directly from the DSP memory 345 without 
the use of the FIFO buffers 355, 360. Finally, additional 
FIFO buffers could be implemented within the DSP memory 
345 for use in maintaining highly transient performance data 
in addition to the FIFO buffers 355, 360 which are associ- 
ated with various state machines of the modem 310 and are 
used for tracking state transitions. 
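
The role of the FIFO buffers described above can be illustrated with a minimal sketch. It assumes the single-buffer variant in which each record carries a state tag; the class names `TransitionRecord` and `TransitionFifo`, the capacity, and the state names are hypothetical, and a Python `deque` merely stands in for a region of DSP memory. The point shown is that a reader which polls slowly still sees every transition, because short-lived states are queued rather than overwritten.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionRecord:
    machine: str    # state tag identifying the state machine (e.g. "RX")
    old_state: str
    new_state: str


class TransitionFifo:
    """Bounded first-in first-out buffer of state transition records."""

    def __init__(self, capacity: int) -> None:
        self._records = deque()
        self._capacity = capacity
        self.dropped = 0  # records lost because the reader fell too far behind

    def push(self, record: TransitionRecord) -> None:
        # Writer side (the DSP in the text): never waits for the reader.
        if len(self._records) >= self._capacity:
            self._records.popleft()
            self.dropped += 1
        self._records.append(record)

    def drain(self) -> list:
        # Reader side (the host): take everything queued since the last read.
        records = list(self._records)
        self._records.clear()
        return records


fifo = TransitionFifo(capacity=16)
# Three short-lived states pass between two host reads.
fifo.push(TransitionRecord("RX", "idle", "phase1"))
fifo.push(TransitionRecord("RX", "phase1", "phase2"))
fifo.push(TransitionRecord("TX", "idle", "phase1"))

captured = fifo.drain()
print([f"{r.machine}:{r.old_state}->{r.new_state}" for r in captured])
```

With one buffer per state machine, as in the illustrated embodiment, the `machine` tag would be implied by the buffer being read and could be omitted from the record.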

A particular example of a type of performance data which 
may be obtained by utilization of the present invention is 
impairment data such as that specified in the TIA PN 3857 
draft 10 specification entitled "North American Telephone 
Network Transmission Model for Evaluating Analog Client 
to Digitally Connected Server Modems." Various impair- 
ments which may be wholly or fully monitored according to 
the teachings of the present invention in the context of a V.90 
modem are found in Table 2-A of the TIA PN 3857 draft 10
document. Further, in Table 1 below, the various types of 
impairments from the draft 10 document and an embodiment 
of monitoring of the various impairments according to the 
present invention are provided: 

TABLE 1

Type of Impairment                    Illustrative Monitoring

Digital PAD Loss from Server to       digits of precision and whether it is
Client                                single or tandem

Robbed Bit Signalling (RBS) from      all 6 intervals are identified so
Server to Client and before the       precise counts can be determined
Digital PAD                           (DIL data also available)

Robbed Bit Signalling (RBS) from      all 6 intervals are identified so
Server to Client and after the        precise counts can be determined
Digital PAD                           (DIL data also available)

Transhybrid Loss                      precise echo canceller response is
                                      available to examine frequency
                                      dependent echo response

IMD 2nd Order                         line probing reveals intermod.
                                      Estimates of distortion at 900, 1200,
                                      1800, and 2400 Hz

Round Trip Delay

Analog PAD Loss                       the equalizer, in conjunction with
                                      analysis of the digital PAD and analog
                                      AGC value, reveals the combined effect
                                      of loop length and analog PAD

Loop Noise                            in the absence of significant nonlinear
                                      distortion, the MSE provides a good
                                      estimate of the noise

The various impairments described in Table 1 above may
further be combined with information related to various
cable loop lengths for the connection. A combined channel
response may then be estimated for the remote central office
transfer function (including digital to analog post filtering)
and the transfer function of the local loop from the central
office. The transfer function for the front end of the client
modem 310 may further be determined. These various
transfer functions may be estimated by capturing the
decision feedback equalizer's coefficients from within the client
modem 310 after equalizer training is completed. In
addition, an approximation of these transfer functions may be
examined prior to equalizer training by examining a curve
fitting error which fits various V.34 modulation and carrier
combinations into the measured frequency response of the
line probing sequence contained in phase 2 of V.90 modem
startup protocol. This information may be useful, for
example, in cases where a V.90 modem falls back to V.34
operation due to digital discontinuity or an excessively long
local loop.

Operations according to an embodiment of the present
invention will now be described with reference to the flow
chart illustration of FIG. 4. Operations begin at block 400
with the initiation of a primary path connection by an
application executing on the host system 300 to establish a
communication connection through the modem 310 to a
remote server modem. A secondary connection is also set up
for monitoring performance of the modem 310 through the
bus interface 325 (block 405). In addition, acquisition of
state information is begun utilizing the FIFO buffers 355,
360 to track the state of one or more state machines
implemented within the modem 310 (block 410). The
modem 310 typically includes a plurality of state machines,
each state machine having a plurality of associated states.
The internal state information regarding the respective states
is maintained by placement in the FIFO buffers 355, 360. In
one embodiment, a respective one of the FIFO buffers is
associated with each of the different state machines of the
modem 310.

As illustrated in the embodiment of FIG. 4, data collection
is, in part, based on timed data events. Detection of a timed
data event (block 420) results in the obtaining of performance
data (block 425). For example, specified performance
parameters, which may be obtained from the DSP memory
345, may be repeatedly obtained and stored on a periodic
basis such as on a minute-by-minute basis during an active
connection. In addition, a state transition event may be
detected (block 430) which also results in the acquisition
and storage of selected performance data. More particularly,
when a state transition event is detected (block 430) the data
associated with the occurring event is identified (block 435).
The identified categories of performance data are then
obtained (block 440). As illustrated at block 445, the
obtained performance data from either block 425 or block
440 is then stored, preferably in a text file format (block
445).
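
The flow of blocks 420 through 445 can be sketched as a small event handler. This is an illustration under stated assumptions rather than the implementation of the described embodiment: the trigger table `TRIGGERS`, the parameter names, the state names, the event dictionaries, the log line layout, and all numeric values are hypothetical, and `read_params` stands in for reads of the DSP memory over the secondary path. The sketch shows the two triggers (a timed event and a state transition) converging on a single text output.

```python
import io

# Hypothetical trigger table: (state machine, starting state, ending state)
# -> names of the parameters to capture on that transition.
TRIGGERS = {
    ("RX", "train", "data"): ["equalizer_coeffs", "echo_canceller_coeffs"],
}
TIMED_PARAMS = ["mse", "rx_rate", "tx_rate"]  # captured on every timed event


def read_params(memory: dict, names: list) -> dict:
    """Stand-in for reading the named values over the secondary path."""
    return {name: memory[name] for name in names}


def handle_event(event: dict, memory: dict, out) -> None:
    """Process one event following blocks 420-445 of the described flow."""
    if event["kind"] == "timer":                      # block 420
        data = read_params(memory, TIMED_PARAMS)      # block 425
    elif event["kind"] == "transition":               # block 430
        key = (event["machine"], event["old"], event["new"])
        names = TRIGGERS.get(key, [])                 # block 435
        data = read_params(memory, names)             # block 440
    else:
        return
    fields = " ".join(f"{k}={v}" for k, v in data.items())
    line = f"{event['time']} {event['kind']} {fields}".rstrip()
    out.write(line + "\n")                            # block 445


# Made-up values used only to exercise the handler.
memory = {
    "mse": 0.012, "rx_rate": 49333, "tx_rate": 31200,
    "equalizer_coeffs": [0.9, -0.1], "echo_canceller_coeffs": [0.05],
}
log = io.StringIO()
handle_event({"kind": "timer", "time": "13:24:00"}, memory, log)
handle_event({"kind": "transition", "time": "13:24:07",
              "machine": "RX", "old": "train", "new": "data"}, memory, log)
print(log.getvalue(), end="")
```

Keying the table on the state machine together with the starting and ending states mirrors the selection described for block 435; a transition with no table entry is still logged, but with no extra parameters.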
As described previously with reference to FIG. 3, the
obtained data related to the performance of the modem is
obtained from the DSP memory 345 during an active
connection of the modem 310 using a secondary path 335 to the
DSP memory 345. The obtained data may correspond to
startup phase operations or data communication portions of
the active connection. The data related to the performance of
the modem may be selected from line condition data, such
as current mean squared error, current number of error
events since the connection began, current data rates for
transmit and receive and interworking data, such as remote
retrain or rate renegotiation requests, remote modem brand
identification, remote modem requests to go to error
recovery during startup, retrain or rate renegotiation procedures,
etc. Examples of such data in the context of a V.90 modem
have been described more fully above with reference to the
discussion of FIG. 3.

The state transition event is detected at block 430 based
on internal state information obtained from the FIFO buffers
355, 360. For example, a state transition may be detected for
one of the plurality of state machines of the modem 310
based on detecting a change from one of the plurality of
associated states of the one of the plurality of state machines
to another of the plurality of associated states of the one of
the plurality of state machines. Furthermore, operations for
determining the selected data to obtain at block 435
responsive to a state transition event may be based on the one of
the plurality of state machines which has undergone a state
transition and may further be based upon the starting and
ending states of the respective state transition for that one of
the plurality of state machines.

In a preferred embodiment of the present invention, the
performance data is stored at block 445 in two separate text
output files. The first file is utilized for a global summary of
the active connection with time interval (such as
minute-by-minute) snap shots detailing the value of the various
performance data obtained from the DSP memory 345. A
second output file is preferably state machine focused and
details the condition of the various state machines of the
modem 310 and further includes any critical state variables
and associated other performance data which is selected for
capture based on particular state transitions.

The performance data files from block 445 may further be
provided to a remote location for analysis (block 450).
Accordingly, a manufacturer or service provider may
remotely access diagnostic information suitable for
analyzing problems and addressing user complaints. More
particularly, connection specific information may be
provided which may be particularly useful in light of the
differences between line connections for various users in
point to point connections thereby allowing the service
provider to deal with individual cases to provide responsive
support to individual users. It is further to be understood that
the operations at blocks 420 through 450 may repeat
throughout the duration of a connection as additional timed
data events or state transition events are encountered or until
monitoring operations are discontinued by a user.

In order to aid those of skill in the art to further understand
the present invention, exemplary performance monitoring
outputs in a two file format for an embodiment of the present
invention as shown in Appendix A and Appendix B will now
be described. Appendix A and Appendix B correspond
respectively to the two output text file format described for
a preferred embodiment above. More particularly, Appendix
A is an output file which is state machine driven and details
the state machines of a modem and identifies critical state
variables that are captured based on state transitions. As
shown in Appendix A, the respective state machines of the
modem include four separate state machines: transmit (TX),
receive (RX), front end transmit (FT) and front end receive
(FR). The various states for each of the respective state
machines correspond to operations in compliance with the
V.90 standard and will be understood by those of skill in the
art familiar with that standard. The first column indicates the
time of occurrence of the state condition. The asterisks in the
state machine columns indicate the state machine which
underwent a transition triggering the creation of the row
entry. The final column does not correspond to a state
machine but reflects a data value for the current mean
squared error (MSE) at the output of the equalizer which the
exemplary protocol captures and records at each state
transition. In addition, at various points in the state machine
tracking text file of Appendix A, selected performance data
values are captured responsive to particular state machine
state transitions. For example, at the time 13:23:44,
responsive to a transition of the RX state machine from a
recvj state to a recv_JP state, various equalizer and echo
canceller related parameter values are captured. State
transition triggered parameter capture is also reflected at times
13:23:47, 13:23:49, 13:23:50, 13:23:52 and 13:23:55.

The second text output file format is illustrated by the
exemplary embodiment in Appendix B. Appendix B
corresponds to a text file which is a global summary of a
connection with details. As with the example of Appendix A,
the terminology utilized in the example of Appendix B is
consistent with that defined by the V.90 standard and will be
understood by those of ordinary skill in the art. Prior to the
power levels section is a list of a number of V.8 (phase 1)
bytes from both the local and the remote modem. The power
levels are self explanatory. The global minute by minute
summary includes LMCPS (last minute characters per
second) which shows the estimated average throughput of
the modem during the last minute including the effects of
retrains, rate renegotiations and line errors. SUMEE refers to
the running sum of line errors that have taken place since the
connection began. The remaining parameters are generally
related to a particular ACP modem, the ThinkPad™ ACP
modem, and may or may not be parameters found in other
modem products.

The present invention has been described above with
reference to the block diagram illustrations of FIG. 3 and
the flowchart illustration of FIG. 4. It will be understood that
each block of the flowchart illustrations and/or block
diagrams, and combinations of blocks in the flowchart
illustrations and/or block diagrams, can be implemented by
computer program instructions. These program instructions
may be provided to a processor to produce a machine, such
that the instructions which execute on the processor create
means for implementing the functions specified in the
flowchart or block diagram block or blocks. The computer
program instructions may be executed by a processor to
cause a series of operational steps to be performed by the
processor to produce a computer implemented process such
that the instructions which execute on the processor provide
steps for implementing the functions specified in the
flowchart or block diagram block or blocks.

Accordingly, blocks of the block diagrams and/or
flowchart illustrations support combinations of means for
performing the specified functions, combinations of steps for
performing the specified functions and program instruction
means for performing the specified functions. It will also be
understood that each block of the block diagrams and/or
flowchart illustrations, and combinations of blocks in the
block diagrams and/or flowchart illustrations, can be
implemented by special purpose hardware-based systems which
perform the specified functions or steps, or combinations of
special purpose hardware and computer instructions.
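
As one illustration of program instructions of this kind, the state machine tracking file described for Appendix A (a time column, one column per state machine with an asterisk marking the machine that transitioned, and a final MSE column) could be written along the following lines. The appendix itself is not reproduced here, so the column widths, the state names, and the MSE value are assumptions chosen for the sketch, as is the function name `format_row`.

```python
MACHINES = ["TX", "RX", "FT", "FR"]  # the four state machines named in the text


def format_row(time: str, states: dict, changed: str, mse: float) -> str:
    """Build one row: time, one column per state machine, then the MSE."""
    cells = []
    for machine in MACHINES:
        # An asterisk marks the machine whose transition created this row.
        marker = "*" if machine == changed else " "
        cells.append(f"{marker}{states[machine]:<10}")
    return f"{time} " + " ".join(cells) + f" {mse:.4f}"


# Hypothetical current state of each machine at the time of the transition.
states = {"TX": "tx_idle", "RX": "rx_train", "FT": "ft_run", "FR": "fr_run"}
row = format_row("13:23:44", states, changed="RX", mse=0.0123)
print(row)
```

Writing every machine's current state on each row, rather than only the one that changed, keeps each line self-contained for someone reading the file without replaying earlier rows.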

It should also be noted that, in some alternative 
implementations, the functions noted in the blocks may 
occur out of the order noted in the figures. For example, two 
blocks shown in succession may in fact be executed sub- 
stantially concurrently or the blocks may sometimes be 
executed in the reverse order, depending upon the
functionality involved.

While the present invention has been illustrated and 
described in detail in the drawings and foregoing 
description, it is understood that the embodiments shown are 
merely exemplary. Moreover, it is to be understood that 
many variations and modifications can be made to the 
embodiments described herein above without substantially 
departing from the principles of the present invention. All 
such variations and modifications are intended to be 
included herein within the scope of the present invention, as 
set forth in the following claims. 

We claim: 

1. A method for monitoring performance of a modem 
comprising the steps of: 

establishing an active communications connection utilizing
the modem;

obtaining data related to the performance of the modem,
including internal state information, from a digital
signal processor (DSP) memory of the modem using a
secondary path to the DSP memory which is separate
and independent from a primary path to said DSP
memory utilized in support of the communications
connection;

determining that a state transition has occurred based on
the internal state information; and

capturing a selected type of data related to the performance
of the modem responsive to a state transition.

2. A method according to claim 1 wherein the obtaining
step further comprises the step of obtaining data related to
the performance of the modem from the DSP memory
during startup of the active connection.

3. A method according to claim 1 wherein the data related
to the performance of the modem is selected from the group
consisting of line condition data and interworking data.

4. A method according to claim 1 wherein the modem
includes a plurality of state machines each state machine
having a plurality of associated states and wherein the
determining step further comprises the step of determining
that a state transition of one of the plurality of state machines
has occurred based on detecting a change from one of the
plurality of associated states of the one of the plurality of
state machines to another of the plurality of associated states
of the one of the plurality of state machines and wherein the
selected type of data is selected based on the one of the
plurality of state machines.

5. A method according to claim 4 wherein the selected
type of data is further selected based on the one of the
plurality of associated states and the another of the plurality
of associated states.

6. A method according to claim 1 wherein the determining
step is preceded by the step of providing a first-in first-out
(FIFO) buffer in the DSP memory and wherein the
determining step further comprises the step of placing the internal
state information in the FIFO buffer.

7. A method according to claim 6 wherein the providing
step further comprises the step of providing a plurality of
FIFO buffers in the DSP memory, each of the plurality of
FIFO buffers being associated with a different state machine
of the modem and wherein the placing step further
comprises placing internal state information associated with a
respective state machine of the modem in a corresponding
one of the plurality of FIFO buffers.

8. A method according to claim 1 wherein the obtaining 
step is followed by the step of storing the data related to the 
performance of the modem in a text file. 

9. A method according to claim 8 wherein the storing step 
is followed by the step of providing the stored data to a 
remote location for analysis. 

10. A system for monitoring performance of a modem 
comprising: 

a host system having a host system bus; 
a modem, the modem including: 
a digital signal processor (DSP); 
a memory coupled to the DSP over a DSP system bus; 
a primary path to the DSP memory that supports a 

communication connection; 
a plurality of first-in first-out (FIFO) buffers coupled to
the DSP, each of the FIFO buffers being associated
with one of a plurality of state machines of the
modem,
wherein the DSP is configured to place internal state
information associated with the
plurality of state machines in a corresponding one of
the FIFO buffers; and
a bus interface coupling the host system bus to the DSP 
system bus, the bus interface being configured to allow 
the host system to access the DSP memory to obtain 
data related to performance of the modem during an 
active communication session supported by the pri- 
mary path of the modem using a secondary path to the 
DSP memory which is separate and independent from 
the primary path to said DSP. 

11. A system for monitoring performance of a modem 
comprising: 

an interface coupled to the modem; 
means for obtaining data related to the performance of the 
modem, including internal state information, from a 
digital signal processor (DSP) memory of the modem 
during an active connection to the modem using a 
secondary path to the DSP memory which is separate 
and independent from a primary path to said DSP 
memory utilized in support of the communications 
connection, the secondary path including the interface 
coupled to the modem; 
means for determining that a state transition has occurred 

based on the internal state information; and 
means for capturing a selected type of data related to the 
performance of the modem responsive to a state tran- 
sition. 

12. A system according to claim 11 wherein the means for 
obtaining further comprises means for obtaining data related 
to the performance of the modem from the DSP memory 
during startup of the active connection. 

13. A system according to claim 11 wherein the data
related to the performance of the modem is selected from the
group consisting of line condition data and interworking
data.



14. A system according to claim 11 wherein the modem 
includes a plurality of state machines each state machine 
having a plurality of associated states and wherein the means 
for determining further comprises means for determining 
that a state transition of one of the plurality of state machines 
has occurred based on detecting a change from one of the 
plurality of associated states of the one of the plurality of
state machines to another of the plurality of associated states 
of the one of the plurality of state machines and wherein the 
selected type of data is selected based on the one of the 
plurality of state machines. 

15. A system according to claim 14 wherein the selected 
type of data is further selected based on the one of the 
plurality of associated states and the another of the plurality 
of associated states. 

16. A system according to claim 11 wherein the system 
further comprises a first-in first-out (FIFO) buffer in the DSP 
memory and wherein the means for determining further 
comprises means for placing the internal state information in 
the FIFO buffer. 

17. A system according to claim 16 wherein the means for 
providing further comprises means for providing a plurality 
of FIFO buffers in the DSP memory, each of the plurality of 
FIFO buffers being associated with a different state machine 
of the modem and wherein the means for placing further 
comprises means for placing internal state information asso- 
ciated with a respective state machine of the modem in a 
corresponding one of the plurality of FIFO buffers. 

18. A system according to claim 11 further comprising 
means for storing the data related to the performance of the
modem in a text file. 

19. A system according to claim 18 further comprising 
means for providing the stored data to a remote location for 
analysis. 

20. A computer program product for monitoring perfor- 
mance of a modem, comprising: 

a computer readable storage medium having computer 
readable program code means embodied therein, the 
computer readable code means comprising: 

computer readable code which establishes an active com- 
munications connection utilizing the modem; 

computer readable code which obtains data related to the 
performance of the modem, including internal state 
information, from a digital signal processor (DSP) 
memory of the modem using a secondary path to the 
DSP memory which is separate and independent from 
a primary path to said DSP memory utilized in support 
of the communications connection; 

computer readable code which determines that a state 
transition has occurred based on the internal state 
information; and 

computer readable code which captures a selected type of 
data related to the performance of the modem respon- 
sive to a state transition. 

21. A computer program product according to claim 20
wherein the computer readable code which obtains further
comprises computer readable code which obtains data
related to the performance of the modem from the DSP
memory during startup of the active connection.

22. A computer program product according to claim 20
wherein the data related to the performance of the modem is
selected from the group consisting of line condition data and
interworking data.

23. A computer program product according to claim 20
wherein the modem includes a plurality of state machines
each state machine having a plurality of associated states
and wherein the computer readable code which determines
further comprises computer readable code which determines
that a state transition of one of the plurality of state machines
has occurred based on detecting a change from one of the
plurality of associated states of the one of the plurality of
state machines to another of the plurality of associated states
of the one of the plurality of state machines and wherein the
selected type of data is selected based on the one of the
plurality of state machines.

24. A computer program product according to claim 23
wherein the selected type of data is further selected based on
the one of the plurality of associated states and the another
of the plurality of associated states.

25. A computer program product according to claim 20
wherein the computer readable code which determines
further comprises computer readable code which places the
internal state information in a first-in first-out (FIFO) buffer
in the DSP memory.

26. A computer program product according to claim 25
wherein the computer readable code which provides further
comprises computer readable code which provides a
plurality of FIFO buffers in the DSP memory, each of the plurality
of FIFO buffers being associated with a different state
machine of the modem and wherein the computer readable
code which places further comprises computer readable
code which places internal state information associated with
a respective state machine of the modem in a corresponding
one of the plurality of FIFO buffers.

27. A computer program product according to claim 20
further comprising computer readable code which stores the
data related to the performance of the modem in a text file.

28. A computer program product according to claim 27
further comprising computer readable code which provides
the stored data to a remote location for analysis.
FIG. 25A (flowchart)

670: A CONTROL STATION PROVIDES A GRAPHICAL USER INTERFACE
FOR ENABLING A USER TO SPECIFY AT LEAST ONE FILTER

671: DISPLAY A MENU LISTING ITEMS REPRESENTING
EVENT-GENERATING MACHINES, EVENT-GENERATING COMPONENTS,
AND/OR CATEGORIES OF EVENTS WITHIN THE DATA PROCESSING
SYSTEM

RECEIVE A MENU ENTRY SELECTION SIGNAL INDICATIVE OF A USER
INTERFACE SELECTION DEVICE SELECTING ONE OF THE ITEMS TO
MONITOR

REPEAT THE PREVIOUS STEP, AS NECESSARY, UNTIL ALL DESIRED
ITEMS HAVE BEEN SELECTED

ALTERNATIVELY, THE GRAPHICAL USER INTERFACE DISPLAYS A
PREDEFINED LIST OF FILTERS FROM WHICH A USER CAN SPECIFY
AT LEAST ONE FILTER



U.S. Patent Oct. 15, 2002 Sheet 27 of 33 



US 6,467,052 Bl 



FIG. 25B

[Drawing sheets FIG. 26A and FIG. 27B: images not reproduced.]

METHOD AND APPARATUS FOR ANALYZING PERFORMANCE OF DATA
PROCESSING SYSTEM

TECHNICAL FIELD

This invention relates generally to data processing and,
more particularly, to a method and apparatus for analyzing
the performance of a data processing system.

COPYRIGHT NOTICE/PERMISSION

A portion of the disclosure of this patent document
contains material which is subject to copyright protection.
The copyright owner has no objection to the facsimile
reproduction by anyone of the patent document or the patent
disclosure as it appears in the U.S. Patent and Trademark
Office patent file or records, but otherwise reserves all
copyright rights whatsoever. The following notice applies to
the software and data as described below and in the drawings
hereto: Copyright © 1997-1999, Microsoft Corporation, All
Rights Reserved.

BACKGROUND OF THE INVENTION

In the field of data processing it is a well known problem
that software developers usually require a period of time to
identify and resolve functional and performance issues in the
code they have written or integrated. There can be many
reasons for such issues, including the basic system and
software architecture; non-optimized and/or flawed coding;
the choice of, utilization of, and contention for system
resources; timing and synchronization; system loading; and
so forth.

Particularly in the area of distributed computer networks,
it can be extremely difficult for software developers to
observe and isolate undesirable system performance and
behavior. A distributed computer network is defined herein
to mean, at a minimum, a data processing system that
utilizes more than one software application simultaneously
or that comprises more than one processor.

For example, a single box or machine which is running
two or more processes, such as a data base application and
a spreadsheet application simultaneously, fulfills this
definition. Also, a single article such as a hand-held computer
may comprise more than one microprocessor and thus
fulfills the definition.

More commonly, however, distributed computer networks
may comprise two or more physical boxes or machines,
often hundreds or even millions (in the case of the Internet).
A software developer trying to monitor and analyze the
operation and behavior of such complex computer networks
is faced with a very daunting task.

For example, a developer may be writing or have written
a server component that performs credit checks. This
software component is used in a larger application that performs
order entry processing. There are several other server
components in the system (such as inventory verification, order
validation, etc.) some of which run on the same server and
some of which run on a separate server (where the inventory
database resides). To complicate matters, each component
could reside on a computer system in a different state or
country. If the application is not performing or behaving
well, the developer needs to figure out if there is a
performance or behavioral problem and, if so, be able to determine
exactly where the trouble spots are.

In the prior art the developer had to modify his or her
application, by writing trace statements in the code and
having the application write to a log file what was going on
at different places in the network. Then all of the log files
would need to be collected, merged, and sorted. The
developer would then have to sift through the data in a
time-intensive fashion and attempt to determine the performance
problem.

There are numerous deficiencies with the prior approach.

One problem is that only instrumented code can be
analyzed. That means source code must be modified,
recompiled, and re-deployed. This is a serious issue with the
widespread use of operating system services and component
technology in today's applications. Users are typically
[illegible] system and third party components, because they
do not have physical or legal access to the source code. When
they do have access to the source code, [illegible] to
instrument them effectively because they do not understand
the component [illegible].

Another problem is that the modifications to code made
by developers in an attempt to analyze its performance
themselves adversely impact the application's performance.
Further, the development of a highly efficient mechanism for
recording the application data is non-trivial. Typical
implementations involve writing data to disk. Even if the
input/output (I/O) is buffered asynchronously, it can have an
adverse impact on the application being monitored (e.g.
masking actual application I/O).

A problem is that understanding control flow during
transitions is very hard. Typically, in a large distributed
application, transitions to separate processes, or to
processes running on separate machines, are common, and
may happen simultaneously. Since events have to be manually
merged by the developer, it is typically hard to determine
which suspension in one process corresponds to
resumption in another.

An additional problem is that frequently there are a large
number of application areas that might need to be analyzed;
however, not all of them may need to be analyzed at the
same time. Developers who manually instrument their code
must incorporate a selection technology to enable different
portions to be analyzed. Otherwise, the load of all of the
instrumentation has a severe impact on the analysis. This
also requires a complex mechanism for developers to
specify which information to collect on which machine.

Yet another problem is that for distributed applications,
logs from multiple machines (and often multiple logs per
machine) must be merged and sorted. Without synchronized
clocks, this task is very difficult. As well, if the log files are
in different formats (which is likely if they are from different
developers or companies), then the data must be translated
into common formats.

The result of all the effort described in this section is a
very long list of analysis data. Manually analyzing and
isolating performance problems from this amount of data is
a very complex and difficult task.

One further problem with known performance analysis of
data processing systems is that very often such analysis
provides opportunities for breaching the data security of
such systems.

There exists known performance monitoring software in
various forms. Among them is software known as PerfMon
software, which is commercially available from Microsoft
Corporation. PerfMon software is a utility which, among
other things, can provide an indication of the utilization of
the computer's central processor unit (CPU) and memory
unit. PerfMon software operates by sampling. That is, it
tracks continuous data by monitoring a machine and looking
at its behavior. It can track the free space on a disk, monitor
network usage, and so on, but it cannot gather event-based
information, such as what function was most recently
started.

There also exist known tools called profilers. These look 
at a single executing software application and try to under- 
stand its performance. They do this either by monitoring the 
program (in a similar way to PerfMon software), or else they
hook into the program they are monitoring and generate 
"events" each time a program subcomponent (function) 
commences or completes. Profilers typically have a massive 
impact on the performance and behavior of an application, 
because they are intrusive, and they typically require special
compiler support. Their data is so detailed that it is normally 
impractical to use them, particularly in a distributed com- 
puting environment such as the one described above. 

The Windows NT® PerfMon utility, commercially available
from Microsoft Corporation, provides an extensible
architecture for the collection and display of arbitrary appli- 
cation and system counters and metrics. Windows NT pro- 
vides base counters for the system for the purpose of 
monitoring CPU and memory utilization. It also provides 
counters for networks, disks, devices, processes, and so
forth. Most system objects export counters. Many applica- 
tions available from Microsoft Corporation (such as MTS 
and SQL Server) and other suppliers provide additional 
counters. 

Therefore, there is a substantial need to provide software 
developers with automated tools for efficiently analyzing the 
performance, function, and behavior of their applications. 

There is also a substantial need to provide such developers
with tools for analyzing the performance, function, and
behavior of their applications, either while the applications
are executing or post mortem, and without significantly
affecting the performance or data security characteristics of
the applications.

In addition, there is a substantial need, in a commercial
environment, to provide Application Program Interfaces 
(APIs) to such tools. 

SUMMARY OF THE INVENTION 

The above-mentioned shortcomings, disadvantages and
problems are addressed by the present invention, which will 
be understood by reading and studying the Detailed Descrip- 
tion of the Invention. However, a brief summary of the 
invention will first be provided. 

The present invention includes a number of different 
aspects for analyzing the performance of a data processing 
system. For the purposes of describing this invention, the 
term "performance" is intended to include within its meaning
not only the operational performance, but also the
function, structure, operation, and behavior of a data pro- 
cessing system. 

While the invention has utility in analyzing the perfor- 
mance of a software application that is executing on a 
distributed data processing system, its utility is not limited
to such, and it has utility in analyzing the performance of 
computer hardware, computer software of all types includ- 
ing data structures, and a wide spectrum of data processing 
systems comprising both computer hardware and computer 
software.

Insofar as the overall architecture and operation of the
present invention is concerned, each machine where a portion
of a distributed software application executes has at
least one local event concentrator (LEC). In addition, there
is at least one in-process event creator (IEC) and at least one
dynamic event creator (DEC) per machine. The function of
an IEC is to monitor the executing process for particular
situations that occur which the developer wants to be 
monitored and to create an "event" that can be captured and 
later analyzed. The function of a DEC is similar to that of an 
IEC, but it monitors some aspect of the system operation that 
the developer wants to be monitored on a periodic or time 
basis and creates an "event" that can also be captured and 
later analyzed. 
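
The distinction between the two kinds of event creator can be illustrated with a minimal sketch. It is not the implementation described in this patent; the class names, the `Event` fields, and the `cpu_load` metric are hypothetical and chosen only for illustration. The IEC here fires when the monitored code reaches a situation of interest, while the DEC reads a metric each time it is polled (a real DEC would presumably be driven by a timer).

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Event:
    """One monitored occurrence; the field names are illustrative only."""
    source: str   # which creator produced the event ("IEC" or "DEC")
    name: str     # what happened, or which metric was read
    timestamp: float = field(default_factory=time.time)
    value: float = 0.0


class InProcessEventCreator:
    """Hypothetical IEC: fires when monitored code hits a situation."""

    def __init__(self, sink: Callable[[Event], None]):
        self.sink = sink

    def fire(self, name: str) -> None:
        self.sink(Event(source="IEC", name=name))


class DynamicEventCreator:
    """Hypothetical DEC: reads a system metric each time it is polled."""

    def __init__(self, sink: Callable[[Event], None], name: str,
                 read_metric: Callable[[], float]):
        self.sink = sink
        self.name = name
        self.read_metric = read_metric

    def poll(self) -> None:
        self.sink(Event(source="DEC", name=self.name, value=self.read_metric()))


# A plain list stands in for the component that collects the events.
collected: List[Event] = []
iec = InProcessEventCreator(collected.append)
dec = DynamicEventCreator(collected.append, "cpu_load", lambda: 0.42)

iec.fire("credit_check_started")    # situation-driven event
dec.poll()                          # time-driven event
iec.fire("credit_check_finished")
```

Both creators emit the same event shape, so whatever collects the events can store and order them uniformly regardless of which creator produced them.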

The developer can specify by means of a "filter" what to 
look for in the system under examination. This narrows the 
scope of the search to what is of interest to the developer and 
reduces the burden on the performance monitoring system. 

When the IEC and DEC create events, they send them to 
the LEC, which collects them and temporarily stores them, 
either until the developer requests them or a developer- 
defined condition or "trigger" occurs, whereupon the LEC 
sends the events to the developer's control station. The 
control station analyzes the events and visually displays the 
results of the analysis to the developer in a multi-windowed, 
time-synchronized display. 

In order to prevent the collection of information from 
adversely affecting the performance of the system, the IEC 
and DEC are only active when they are carrying out the 
developer's orders to monitor certain things. Otherwise they 
are dormant and do not affect the performance. When an IEC 
is activated and is monitoring process execution for particu- 
lar situations, it creates a stream of events during "normal" 
execution and sends them to the LEC. However, the LEC 
doesn't send them through the network to the developer's 
control station until they are needed. 

In another aspect of the invention, a data design structure 
allows two communicating entities to describe their inter- 
actions and inter-relationships despite knowing almost noth- 
ing about each other. The data design structure includes 
pre-defined event fields and custom fields, and it breaks up 
the application into a series of black boxes and maps out the 
entities of the network and their inter-relationships for 
displaying to the developer an animated model of the 
application as it is executing, either in real time or "post 
mortem". 

In another aspect, the invention provides for user-defined 
triggers which cause the performance analysis software to 
passively buffer events until a malfunction occurs, then 
dump the buffered data and analyze it. This allows low- 
impact monitoring, since no information is stored until 
something of interest happens. 
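
One way to picture trigger-based buffering is a bounded buffer that keeps only the most recent events and hands them over when a trigger condition is met. The sketch below is an assumption-laden illustration, not the patented mechanism: the `TriggeredBuffer` class, the dictionary event shape, and the "status equals error" trigger are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class TriggeredBuffer:
    """Keeps only the most recent events until a trigger condition is met."""

    def __init__(self, capacity: int, trigger: Callable[[dict], bool]):
        # A deque with maxlen silently discards the oldest entry when full.
        self.recent: Deque[dict] = deque(maxlen=capacity)
        self.trigger = trigger

    def record(self, event: dict) -> Optional[List[dict]]:
        """Buffer the event; return a dump of the buffer if it fires the trigger."""
        self.recent.append(event)
        if self.trigger(event):
            dump = list(self.recent)
            self.recent.clear()
            return dump
        return None


buf = TriggeredBuffer(capacity=3, trigger=lambda e: e["status"] == "error")
dumps = []
for seq, status in enumerate(["ok", "ok", "ok", "ok", "error", "ok"]):
    dumped = buf.record({"seq": seq, "status": status})
    if dumped is not None:
        dumps.append(dumped)   # in a real system this would go to analysis
```

With a capacity of three, the single dump contains the error event and the two events that preceded it; everything older has already been discarded, which is what keeps the monitoring cost low.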

In another aspect, the invention comprises filter reduction 
features with which the developer can specify exactly what 
information within the network is of interest. Filter reduc- 
tion is used to narrow the scope of the filter to extract only 
the information of interest and hence reduce the perfor- 
mance impact of monitoring. 

In another aspect, the invention comprises filter combi- 
nation features with which different users can specify indi- 
vidual filters that can be combined. The LEC can be multi- 
threaded and combine filters submitted by multiple users. 
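
As a rough illustration of filter combination, individual filters can be modeled as predicates over events and combined so that an event is collected once if any user's filter accepts it. The summary does not state the combination rule, so the logical OR used here is an assumption, as are the `combine` helper, the field names, and the sample users.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]
Filter = Callable[[Event], bool]


def combine(filters: List[Filter]) -> Filter:
    """Return a filter that accepts an event if any individual filter accepts it."""
    return lambda event: any(f(event) for f in filters)


def alice(event: Event) -> bool:
    """Hypothetical user interested in one machine."""
    return event["machine"] == "server1"


def bob(event: Event) -> bool:
    """Hypothetical user interested in one event category."""
    return event["category"] == "database"


combined = combine([alice, bob])

events = [
    {"machine": "server1", "category": "network"},
    {"machine": "server2", "category": "database"},
    {"machine": "server2", "category": "network"},
]
collected = [e for e in events if combined(e)]
```

Evaluating one combined filter means each event is inspected once, however many users have asked for it.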

In another aspect, the invention comprises a filter user 
interface which is a graphical representation of the 
machines, entities, and events making up the network. The 
user can easily pick those of interest, using displayed lists 
and Boolean operator tabs, or can simply write an order in 
text format which is converted to the appropriate filter.

In another aspect, the invention comprises APIs for
registration, in-process event creators, dynamic event 
creators, and other functions implementing the various 
aspects of the invention. 

In another aspect, the invention provides for the automatic 
generation of an animated application model of the process 
under examination. A dynamic diagram of the application is 
automatically displayed as the various constituents interact. 
A video cassette recorder (VCR) paradigm is used to "play, 
replay, stop, pause, change speed, and reverse" the display, 
to enable the user to see what's happening as the application 
executes. 
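
The VCR paradigm can be sketched as a cursor that moves over a recorded list of events, with a signed step for direction and speed. This is only an illustrative model under stated assumptions: the `EventPlayer` class, its method names, and the use of plain strings as events are hypothetical and not taken from the patent.

```python
from typing import List


class EventPlayer:
    """Steps a cursor through recorded events, forwards or backwards."""

    def __init__(self, events: List[str]):
        self.events = events
        self.position = 0   # index of the next event when playing forward
        self.step = 1       # positive plays forward, negative plays in reverse

    def reverse(self) -> None:
        self.step = -self.step

    def set_speed(self, events_per_tick: int) -> None:
        direction = 1 if self.step > 0 else -1
        self.step = direction * events_per_tick

    def replay(self) -> None:
        self.position = 0
        self.step = 1

    def tick(self) -> List[str]:
        """Advance by one tick and return the events passed, in display order."""
        if self.step > 0:
            end = min(len(self.events), self.position + self.step)
            shown = self.events[self.position:end]
        else:
            end = max(0, self.position + self.step)
            shown = self.events[end:self.position][::-1]
        self.position = end
        return shown


player = EventPlayer(["a", "b", "c", "d"])
first = player.tick()     # normal speed, forward
player.set_speed(2)
second = player.tick()    # double speed, forward
player.reverse()
third = player.tick()     # double speed, backward
```

Pausing corresponds to simply not calling `tick`, and stopping to calling `replay` without ticking again.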

In another aspect, the invention provides for automatic, 
synchronized display of all performance analysis data. A 
number of user-customized, synchronized display windows 
show the constituent parts of the application execution and 
the corresponding performance characteristics, in both Gantt 
chart and graphical modes, either in real-time or post- 
mortem. A timeline window displays a visual representation 
of the timing of all related events. A summary window 
displays a distillation of the system performance during a 
user-selected time slice. 
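
A summary over a user-selected time slice can be thought of as clipping each recorded activity interval to the slice and totalling what remains per entity. The sketch below assumes that interpretation; the `summarize` function, the interval tuples, and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (entity, start_time, end_time) in seconds; hypothetical sample data.
Interval = Tuple[str, float, float]


def summarize(intervals: List[Interval], lo: float, hi: float) -> Dict[str, float]:
    """Total time each entity was active inside the slice [lo, hi]."""
    totals: Dict[str, float] = defaultdict(float)
    for entity, start, end in intervals:
        overlap = min(end, hi) - max(start, lo)
        if overlap > 0:   # skip intervals that fall entirely outside the slice
            totals[entity] += overlap
    return dict(totals)


intervals = [
    ("client", 0.0, 2.0),
    ("server", 1.0, 4.0),
    ("database", 3.0, 5.0),
    ("client", 6.0, 7.0),
]
summary = summarize(intervals, lo=1.0, hi=4.0)
```

The same intervals could also feed a Gantt-style timeline, since each tuple already carries an entity and its start and end times.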

In another aspect, the invention provides suitable data 
security mechanisms throughout the network being moni- 
tored. Discretionary access is applied to the collection of 
data from a specific machine. 

The present invention describes systems, clients, servers, 
methods, and computer-readable media of varying scope. In 
addition to the aspects and advantages of the present inven- 
tion described in this summary, further aspects and advan- 
tages of the invention will become apparent by reference to 
the drawings and by reading the Detailed Description that 
follows. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is pointed out with particularity in the 
appended claims. However, other features of the invention 
will become more apparent and the invention will be best 
understood by referring to the following Detailed Descrip- 
tion in conjunction with the accompanying drawings in 
which: 

FIG. 1 illustrates a hardware and operating environment 
in conjunction with which embodiments of the invention can 
be practiced; 

FIG. 2 illustrates a system-level overview of an exem- 
plary embodiment of the invention; 

FIG. 3 illustrates a machine-level overview of an exem- 
plary embodiment of the invention; 

FIG. 4 illustrates in schematic fashion pre-defined event 
fields and custom fields, which are included in an event 
packet within an exemplary embodiment of the invention; 

FIG. 5 illustrates a transition between two entities within 
the hardware and operating environment; 

FIG. 6 is a table which illustrates how pre-defined event 
fields are used to establish a relationship between a source 
and a target entity; 

FIG. 7 illustrates in schematic fashion how events
selected by a user are monitored;

FIG. 8 illustrates a process of filter reduction as used 
within an exemplary embodiment of the invention; 

FIG. 9 illustrates a process of filter combination as used 
within an exemplary embodiment of the invention; 

FIG. 10 illustrates another process of filter combination as 
used within an exemplary embodiment of the invention;

FIG. 11 illustrates a screen print of an exemplary user
interface for specifying a filter;

FIG. 12 illustrates a system level overview of an exemplary
embodiment showing where APIs of the present invention
can appear within the software architecture of a
distributed computing system;

FIG. 13 illustrates a screen print of an animated application
model which the present invention generates to show
the structure and activity of an application whose
performance is being studied;

FIG. 14 illustrates various user interface features of an 
animated application model in an exemplary embodiment of 
the invention; 

FIG. 15 illustrates a representative display of performance
data in an exemplary embodiment of the invention; 

FIG. 16 illustrates a screen print of an exemplary display 
of performance data; 

FIG. 17 illustrates a screen print of a timeline display of
performance data;

FIG. 18 illustrates a screen print of a summary display of
performance data; 

FIG. 19 illustrates a screen print of several synchronized 
sets of performance data;

FIG. 20 A-C is a flowchart of a method illustrating an 
exemplary embodiment of overall data collection architec- 
ture and how data is collected via the IECs, DECs, and 
LECs; 

FIG. 21 A-B is a flowchart of a method illustrating an
exemplary embodiment of overall data design and how the 
VSA determines and maps relationships between entities; 

FIG. 22 A-B is a flowchart of a method illustrating an 
exemplary embodiment of triggers;

FIG. 23 A-B is a flowchart of a method illustrating an
exemplary embodiment of filter reduction; 

FIG. 24 A-B is a flowchart of a method illustrating an 
exemplary embodiment of filter combination; 

FIG. 25 A-B is a flowchart of a method illustrating an 
exemplary embodiment of a user interface for specifying 
one or more filters; 

FIG. 26 A-C is a flowchart of a method illustrating an
exemplary embodiment of automatic generation of an
animated application model; and

FIG. 27 A-C is a flowchart of a method illustrating an 
exemplary embodiment of a user interface for displaying the 
performance analysis of the system under examination. 

DETAILED DESCRIPTION OF THE INVENTION

In the following Detailed Description of exemplary 
embodiments of the invention, reference is made to the 
accompanying drawings that form a part hereof, and which
show by way of illustration specific exemplary embodiments
in which the invention can be practiced. These embodiments
are described in sufficient detail to enable those skilled in the
art to practice the invention. It is to be understood that other
embodiments can be utilized and that logical, mechanical,
electrical, and other changes can be made without departing
from the spirit and scope of the present invention. The 
following Detailed Description is, therefore, not to be taken 
in a limiting sense, and the scope of the present invention is 
defined only by the appended claims. 

The Detailed Description is divided into six sections. In
the first section, a Glossary of Terms is provided. In the
second section, an Exemplary Hardware and Operating
Environment in conjunction with which embodiments of the
invention can be practiced is described. In the third section,
a System Level Overview of the invention is presented. In
the fourth section, Exemplary Embodiments of the Invention
are provided. In the fifth section, Methods of Exemplary
Embodiments of the Invention are provided. Finally, in the
sixth section, a Conclusion of the Detailed Description is
provided.

Glossary of Terms

The following section provides definitions of various
terms used in the Detailed Description:

ADO — ActiveX® Data Objects, a high-level programming 
interface from Microsoft Corporation for data objects 
which can be used to access different types of data, 
including web pages, spreadsheets, and other types of 
documents. It is designed to provide a consistent way of 
accessing data regardless of how the data is structured. 

API — Application Program Interface, a language and mes- 
sage format used by an application program to commu- 
nicate with the operating system, middleware, or other 
system program such as a database management system. 
APIs are generally implemented by writing function calls 
in the application program, which provide the linkage to 
a specific subroutine for execution. Operating environ- 
ments typically provide an API so that programmers can 
write applications consistent with the operating environ- 
ment. 

COM — Component Object Model, a component software 
architecture from Microsoft Corporation which defines a 
structure for building program routines or objects that can 
be called up and executed in a Microsoft Windows® 
operating system environment. 

DCOM — Distributed Component Object Model, developed 
by Microsoft Corporation, it is an extension of the Com- 
ponent Object Model (COM), which enables object- 
oriented processes distributed across a network to com- 
municate with one another. 

Entity — a functional component in a data processing system, 
such as a client, server, or data source. 

GUID — a Globally Unique Identifier within a data process- 
ing system. Within the present invention it is used to 
identify, for example, a COM object, an event source, an 
event, an event category, and any other system object that 
requires guaranteed unique identification from multiple 
independent generators. 

Machine — a minimal data processing system comprising at
least a processor and a memory, the processor executing 
software instructions which are stored in the memory. 

Middleware — a category of processes between the applica- 
tion itself and backend processes such as databases, 
network connections, and so forth. Applications that run 
on currently available operating systems typically require 
services above and beyond those provided by the oper- 
ating system. These services are often no longer written 
by the application developer but by a third party (which 
can be the operating system vendor). The term "middle- 
ware" indicates the position of these common services 
within the software architecture relative to the applica- 
tion. 

MTS — Microsoft Transaction Server, a feature of the
Microsoft Windows NT Server® operating system that 
facilitates the development and deployment of server- 
centric applications built using Microsoft's Component 
Object Model (COM) technologies. 

NTS — Windows NT Server®, a version of the Microsoft
Windows® operating system. There are currently two
commercially available versions of Windows NT: Windows
NT Server®, designed to act as a server in networks,
and Windows NT Workstation® for stand-alone or client
workstations.

PerfMon — Performance Monitor, a utility provided with
Microsoft Corporation's Windows NT® operating system
which enables the performance monitoring of all services
running on a system.

RPC — Remote Procedure Call, a programming interface
that allows a program on one computer to execute a
program on a server computer. Using RPC, a system
developer need not develop specific procedures for the
server. The client program sends a message to the server
with appropriate arguments, and the server returns a
message containing the results of the program executed.

Windows® operating system — an operating system commercially
available from Microsoft Corporation for several
different computing platforms.
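
The GUID entry in the glossary above can be illustrated with Python's standard `uuid` module. This is a sketch only: the patent does not say how its GUIDs are generated, version 4 (random) UUIDs are used here as an assumption, and the registry contents are hypothetical. Random UUIDs are unique with overwhelming probability rather than by strict guarantee.

```python
import uuid

# Two independent generators create identifiers without coordinating.
event_source_id = uuid.uuid4()
event_category_id = uuid.uuid4()

# A registry keyed by identifier; the descriptions are illustrative only.
registry = {
    event_source_id: "order-entry server component",
    event_category_id: "database calls",
}

# The canonical text form is 36 characters: 32 hex digits and 4 hyphens.
text_form = str(event_source_id)
```

Because each identifier is generated locally, separately developed components can label their events without consulting a central authority.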

Exemplary Hardware and Operating Environment

FIG. 1 illustrates a hardware and operating environment 
in conjunction with which embodiments of the invention can 
be practiced. The description of FIG. 1 is intended to provide 
a brief, general description of suitable computer hardware 
and a suitable computing environment with which the inven- 
tion can be implemented. Although not required, the inven- 
tion is described in the general context of computer- 
executable instructions, such as program modules, being 
executed by a computer, such as a personal computer (PC).
This is one embodiment of many different computer
configurations, some including specialized hardware circuits 
to analyze performance, that can be used to implement the 
present invention. Generally, program modules include 
routines, programs, objects, components, data structures, 
etc. that perform particular tasks or implement particular 
abstract data types. 

Moreover, those skilled in the art will appreciate that the
invention can be practiced with other computer-system
configurations, including hand-held devices, multiprocessor
systems, microprocessor-based or programmable consumer
electronics, network personal computers ("PCs"),
minicomputers, mainframe computers, and the like. The
invention can also be practiced in distributed computing
environments where tasks are performed by remote processing
devices linked through a communications network. In a
distributed computing environment, program modules can
be located in both local and remote memory storage devices.

FIG. 1 shows a general-purpose computing or
information-handling system 80. This embodiment includes
a general purpose computing device such as personal computer
(PC) 20, that includes processing unit 21, a system
memory 22, and a system bus 23 that operatively couples the
system memory 22 and other system components to processing
unit 21. There may be only one or there may be more
than one processing unit 21, such that the processor of
computer 20 comprises a single central-processing unit (CPU),
or a plurality of processing units, commonly referred to as
a parallel processing environment. The computer 20 can be
a conventional computer, a distributed computer, or any
other type of computer; the invention is not so limited.

In other embodiments other configurations are used in PC
20. System bus 23 can be any of several types, including a
memory bus or memory controller, a peripheral bus, and a
local bus, and can use any of a variety of bus architectures.
The system memory 22 may also be referred to as simply the
memory, and it includes read-only memory (ROM) 24 and
random-access memory (RAM) 25. A basic input/output
system (BIOS) 26, stored in ROM 24, contains the basic
routines that transfer information between components of
personal computer 20. BIOS 26 also contains start-up routines
for the system.

Personal computer 20 further includes hard disk drive 27 
having one or more magnetic hard disks (not shown) onto 
which data is stored and retrieved for reading from and 
writing to hard-disk-drive interface 32, magnetic disk drive 
28 for reading from and writing to a removable magnetic
disk 29, and optical disk drive 30 for reading from and/or 
writing to a removable optical disk 31 such as a CD-ROM, 
DVD or other optical medium. Hard disk drive 27, magnetic 
disk drive 28, and optical disk drive 30 are connected to 
system bus 23 by a hard-disk drive interface 32, a
magnetic-disk drive interface 33, and an optical-drive interface 34,
respectively. The drives 27, 28, and 30 and their associated 
computer-readable media 29, 31 provide nonvolatile storage 
of computer-readable instructions, data structures, program 
modules and other data for personal computer 20. Although 
the exemplary environment described herein employs a hard
disk, a removable magnetic disk 29 and a removable optical 
disk 31, those skilled in the art will appreciate that other 
types of computer-readable media which can store data 
accessible by a computer can also be used in the exemplary 
operating environment. Such media may include magnetic
tape cassettes, flash-memory cards, digital video disks 
(DVD), Bernoulli cartridges, RAMs, ROMs, and the like. 

In various embodiments, program modules are stored on 
the hard disk drive 27, magnetic disk 29, optical disk 31, 
ROM 24 and/or RAM 25 and can be moved among these
devices, e.g., from hard disk drive 27 to RAM 25. Program 
modules include operating system 35, one or more applica- 
tion programs 36, other program modules 37, and/or pro- 
gram data 38. A user can enter commands and information 
into personal computer 20 through input devices such as a
keyboard 40 and a pointing device 42. Other input devices 
(not shown) for various embodiments include one or more 
devices selected from a microphone, joystick, game pad, 
satellite dish, scanner, or the like. These and other input 
devices are often connected to the processing unit 21
through a serial-port interface 46 coupled to system bus 23, 
but in other embodiments they are connected through other 
interfaces not shown in FIG. 1, such as a parallel port, a 
game port, or a universal serial bus (USB) interface. A 
monitor 47 or other display device also connects to system
bus 23 via an interface such as a video adapter 48. In some 
embodiments, one or more speakers 57 or other audio output 
transducers are driven by sound adapter 56 connected to 
system bus 23. In some embodiments, in addition to the 
monitor 47, system 80 includes other peripheral output
devices (not shown) such as a printer or the like. 

In some embodiments, personal computer 20 operates in 
a networked environment using logical connections to one 
or more remote computers such as remote computer 49. 
Remote computer 49 can be another personal computer, a
server, a router, a network PC, a peer device, or other 
common network node. Remote computer 49 typically 
includes many or all of the components described above in 
connection with personal computer 20; however, only a 
storage device 50 is illustrated in FIG. 1. The logical
connections depicted in FIG. 1 include local-area network 
(LAN) 51 and a wide-area network (WAN) 52, both of 
which are shown connecting PC 20 to remote computer 49; 
typical embodiments would only include one or the other. 
Such networking environments are commonplace in offices,
enterprise-wide computer networks, intranets and the Internet.

When placed in a LAN networking environment, PC 20
connects to local network 51 through a network interface or 
adapter 53. When used in a WAN networking environment 
such as the Internet, PC 20 typically includes modem 54 or 
other means for establishing communications over network 
52. Modem 54 may be internal or external to PC 20 and 
connects to system bus 23 via serial-port interface 46 in the 
embodiment shown. In a networked environment, program 
modules depicted as residing within PC 20 or portions 
thereof may be stored in remote-storage device 50. Of 
course, the network connections shown are illustrative, and 
other means of establishing a communications link between 
the computers can be substituted. 

Software can be designed using many different methods, 
including object-oriented programming methods. C++ and 
Java are two examples of common object-oriented computer 
programming languages that provide functionality associ- 
ated with object-oriented programming. Object-oriented 
programming methods provide a means to encapsulate data 
members (variables) and member functions (methods) that 
operate on that data into a single entity called a class. 
Object-oriented programming methods also provide a means 
to create new classes based on existing classes. 

An object is an instance of a class. The data members of 
an object are attributes that are stored inside the computer 
memory, and the methods are executable computer code that 
act upon this data, along with potentially providing other 
services. The notion of an object is exploited in the present 
invention in that certain aspects of the invention are imple- 
mented as objects in some embodiments. 

An interface is a group of related functions that are 
organized into a named unit. Some identifier can uniquely 
identify each interface. Interfaces have no instantiation; that 
is, an interface is a definition only without the executable 
code needed to implement the methods that are specified by 
the interface. An object can support an interface by provid- 
ing executable code for the methods specified by the inter- 
face. The executable code supplied by the object must 
comply with the definitions specified by the interface. The 
object can also provide additional methods. Those skilled in 
the art will recognize that interfaces are not limited to use in 
or by an object-oriented programming environment. 
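
The relationship between a class, an object, and an interface can be sketched in Python. This is an illustrative sketch only: the names EventSource, Counter, fire_event, and total are hypothetical and are not taken from the invention.

```python
from abc import ABC, abstractmethod


class EventSource(ABC):
    """An interface: a named group of method definitions with no executable code."""

    @abstractmethod
    def fire_event(self, name):
        """Record that the named event occurred."""


class Counter(EventSource):
    """A class that supports EventSource by supplying code for its methods."""

    def __init__(self):
        self.counts = {}                      # data member (attribute)

    def fire_event(self, name):
        self.counts[name] = self.counts.get(name, 0) + 1

    def total(self):
        """An additional method beyond those the interface specifies."""
        return sum(self.counts.values())


counter = Counter()                           # an object is an instance of a class
counter.fire_event("call")
counter.fire_event("call")
```

The interface itself cannot be instantiated; only an object that provides code for every specified method can.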

System Level Overview 
FIG. 2 illustrates a system-level overview of an exem- 
plary implementation of the invention. The invention has 
utility in the area of data processing, where it can be used to 
analyze the performance of a data processing system, and in 
particular application software, whether under development, 
undergoing testing, or in full utilization. The invention is 
commercially available from Microsoft Corporation as the 
"Visual Studio"® development system or "Visual Studio 
Analyzer"®. In addition, certain portions of the invention 
are provided within the Microsoft Windows® operating 
system. 

The "Visual Studio" development system collects appli- 
cation data by use of instrumentation within the application 
environment in an efficient, distributed collection architec- 
ture. Any application built with any development tool can be 
automatically analyzed and diagnosed, provided it uses 
standard middleware and operating system components. 
There is no requirement for any changes to the application 
itself. 

As mentioned in the Background section earlier, distrib- 
uted data processing systems can be relatively simple or 
extremely complex. The developer of software operating on
a distributed data processing system is usually faced with
serious challenges in understanding the functional operation
and behavior of such software as it is executing.

The system illustrated in FIG. 2 is a globally distributed
system in which different machines 100, 102, 104, 106, and
108 are physically located on several different continents.
These machines are shown as interconnected via hardwire,
fiber-optic cable, radio frequency, or other suitable links
130, 132, 134, and 136 in an arbitrary network arrangement
spanning a large portion of the globe. The difficulties in
understanding and trouble-shooting systems of this
complexity have been significant until the present invention.

The present invention enables complex distributed
applications to be readily understood and analyzed,
notwithstanding that the machines on which they are running
may be thousands of miles apart, and notwithstanding that
the developer may not have access to source code for the
underlying software upon which his or her application is
running.

With reference to FIG. 2, the box identified as VSA 100
is a control and display station that comprises computer
hardware and software. VSA 100 is coupled to one or more
machines, e.g. machines 102, 104, 106, and 108. Each
machine includes a Local Event Concentrator (LEC) 112,
152. One LEC is provided per physical machine, although in
a different implementation more could be provided if
desired. VSA 100 activates an LEC when it wants that LEC
to start collecting events, and VSA 100 deactivates an LEC
when it wants it to stop collecting events. In addition to VSA
100, other client machines can also activate or deactivate an
LEC 112 or 152.

Each LEC 112, 152 is coupled to a respective process
space 110, 150. Each process space 110, 150 can
comprise a group of In-process Event Creators (IECs), such
as IECs #1.1 through #1.N in group 110.

Each LEC 112, 152 is further coupled to a respective
process space 114, 154. Each process space 114, 154 can
comprise a group of Dynamic Event Creators (DECs),
such as DECs #1.1 through #1.N in group 114. Process
spaces 110 and 114 can be identical or different for machine
104, likewise for the process spaces 150, 154 associated
with machine 106. While all DECs are shown in FIG. 2 as
residing in process spaces 114, 154, in one embodiment
DECs that capture global machine state (such as PerfMon
data) reside only within the LEC process space.

Machine-Level Overview
FIG. 3 illustrates a machine-level overview of an
exemplary embodiment of the invention. In FIG. 3 three major
portions of the process space of a machine are shown in the
form of Applications 190, Operating System 191, and
Additional Components 192.

In one aspect, the invention comprises one local event
concentrator (LEC) 199 for each machine. Applications
portion 190 has an IEC 193 associated with it; Operating
Systems portion 191 has an IEC 195 associated with it; and
Additional Components portion 192 has an IEC 197
associated with it.

There is at least one dynamic event creator (DEC) per
machine, such as DEC 189, which is in the process space of
LEC 199. It will be apparent to one of ordinary skill in the
art that DECs could be provided for each portion 190, 191,
192 of the machine's process space. This is shown in FIG.
3 by DEC boxes 194, 196, 198 having dashed lines.

Events created by IECs 193, 195, 197 and DECs 189, 194,
196, 198 are collected by LEC 199. The LEC 199 collects
events generated by the IECs and DECs and sends these
events to the user's control station, VSA 100, for analysis
and display in a user-determined format.

IECs and DECs reside in the process space of data sources
within a machine, and they "report on" these data sources.
They each do this by creating events that are sent to and
collected by the LEC. They are active only when the user is
interested in knowing about these events and in
understanding the system performance.

IECs and DECs differ in their purpose. An IEC creates an
event when a user-specified condition (other than
time-valued data) occurs. An example could be "a COM event in
Machine A". A DEC, on the other hand, creates an event to
reflect data whose value is measured on a periodic or time
basis. An example could be PerfMon data reflecting CPU
utilization.

As mentioned in the Summary section above, the system
described herein for analyzing the performance of a data
processing system is a comprehensive one with many
different aspects, each of which will now be described in the
section below entitled Exemplary Embodiments of the
Invention.

Collection, Capture & Transmission of Data

[...] that marshals the desired information into a special
format and puts it in a shared memory buffer. As mentioned
above, IECs reside in the process space of a data source.

An IEC exports two main functions: IsActive and
FireEvent. The IsActive function is used by data sources to
determine if any analysis is being performed [...]
particular data source. When a piece of code reaches a point
[...] IsActive [...] as to whether or not [...]. If the
IsActive condition [...] True for [...] source, [...]
to the centralized [...] of the requesting [...]. If
IsActive [...] can reduce adverse
performance impact by not formatting data for FireEvent.
The FireEvent function is implemented in both a synchronous
and an asynchronous manner in the invention.

When an LEC has been activated by the VSA 100, it can
turn an IEC on or off, i.e. it switches its IsActive status to
True or False. The Boolean status is maintained in the
process, so there are really never any in-process transitions,
and the code never changes. When IsActive is True, events
are generated. When the VSA 100 user wants to stop
monitoring events, everything can be quickly disconnected.
IsActive is set to False, and the application never changes.

Also, when an LEC has been activated by VSA 100, it can
turn a DEC on or off, depending upon whether the DEC is
to collect events. When a DEC is to stop collecting events,
an LEC simply turns it off. As for IECs, an LEC starts and
stops DECs as specified by a user-specified filter, as will be
discussed further below.

Instead of turning individual IECs on and off, a portion of
the IECs or all of the IECs can be turned on or off. The same
applies to other structures of the invention, including DECs
and LECs.

[...] actually begins collection of events. IECs are only created
for users who desire to monitor system performance. They
are automatically created when needed. This ensures that, if 
the system is not under analysis, the performance impact of 
operating the performance analyzer is negligible. 
Additionally, the system is able to remove all of the IECs
from memory when analysis completes, so that a system 
wherein analysis has finished behaves with the same char- 
acteristics as before performance began, unlike many tradi- 
tional tools. 

IECs and DECs are created by the operating system,
middleware, and application components that are sourcing 
the events. The creation of an IEC will now be described. 
Assume that a middleware entity wants to fire events. It asks 
the operating system to create an IEC. The operating system 
creates an IEC "reference", ready for the IEC in case the user
wants to start monitoring data. When the user wants to start 
monitoring data, the LEC tells the operating system to 
convert the IEC "reference" into a real IEC. The operating 
system converts all the IEC references into real IECs the first 
time they are used.
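
The conversion of an IEC "reference" into a real IEC, together with the IsActive check that precedes FireEvent, can be sketched as follows. This is a minimal sketch under stated assumptions: the class and method names (IEC, IECReference, activate, is_active, fire_event) are hypothetical, and a Python list stands in for the shared memory buffer.

```python
class IEC:
    """Hypothetical real event creator; a list stands in for its shared buffer."""

    def __init__(self, source_name):
        self.source_name = source_name
        self.active = False
        self.buffer = []

    def fire_event(self, payload):
        self.buffer.append((self.source_name, payload))


class IECReference:
    """Cheap placeholder that becomes a real IEC only when monitoring starts."""

    def __init__(self, source_name):
        self.source_name = source_name
        self.iec = None                       # nothing allocated while quiescent

    def activate(self):
        """Called on behalf of the LEC: convert the reference on first use."""
        if self.iec is None:
            self.iec = IEC(self.source_name)
        self.iec.active = True

    def deactivate(self):
        if self.iec is not None:
            self.iec.active = False

    def is_active(self):
        return self.iec is not None and self.iec.active

    def fire_event(self, make_payload):
        """Format and fire the event only when analysis is being performed."""
        if self.is_active():
            self.iec.fire_event(make_payload())


ref = IECReference("middleware")
formatted = []


def expensive_payload():
    formatted.append(1)                       # counts how often formatting ran
    return {"op": "lookup"}


ref.fire_event(expensive_payload)             # quiescent: payload is never built
ref.activate()
ref.fire_event(expensive_payload)             # active: one event is recorded
```

Passing a callable rather than a finished payload is one way to avoid the formatting cost while no one is analyzing the system.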

Events from IECs in process spaces 110, 150 are passed 
to a respective LEC 112, 152 via shared memory buffers.
This allows the event to be communicated without requiring 
a process context switch. Each IEC has its own buffer in 
shared memory, to ensure that conflicts between events and
locking do not distort system performance. 

In one currently implemented embodiment there is only
one LEC per machine. It collects events from all IECs in all
processes on the system that are being analyzed, and it sends
the desired events back to the VSA 100. Since this
communication is likely to be cross-machine, an efficient batching
mechanism is used to reduce network traffic, and
transmission is scheduled for low-system load times. To ensure
efficient dispatch of events across the network, the LEC
process runs at a lower than normal priority. This means that
events will tend to be flushed across the network when the
machine is not busy running the real application or when the
real application is blocked, e.g., when it is waiting for data
to be read from disk. To further reduce performance impact,
events from many IECs are collected together and will not
be sent more often than once per fixed period of time, e.g.
every one-half to one second in one embodiment. If the
number of events to be sent exceeds the buffering capacity,
events will either be sent immediately or thrown away,
depending upon a setting made at the control station.
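
The batching behavior described above can be sketched as follows. This is an illustrative sketch, not the implemented mechanism: the EventBatcher class and its parameters are hypothetical, and the choice to discard the newest event on overflow is an assumption, since the text does not say which events are thrown away.

```python
import time


class EventBatcher:
    """Hypothetical LEC-side buffer that sends at most once per interval."""

    def __init__(self, send, interval=0.5, capacity=4, drop_on_overflow=False,
                 clock=time.monotonic):
        self.send = send                      # callable taking a list of events
        self.interval = interval              # minimum seconds between sends
        self.capacity = capacity              # maximum number of buffered events
        self.drop_on_overflow = drop_on_overflow
        self.clock = clock
        self.pending = []
        self.dropped = 0
        self.last_sent = clock()

    def add(self, event):
        if len(self.pending) >= self.capacity:
            if self.drop_on_overflow:
                self.dropped += 1             # assumption: discard the new event
                return
            self.flush()                      # send immediately to make room
        self.pending.append(event)
        if self.clock() - self.last_sent >= self.interval:
            self.flush()

    def flush(self):
        if self.pending:
            self.send(self.pending)
            self.pending = []
        self.last_sent = self.clock()


# Usage with a fake clock so the timing is deterministic.
now = [0.0]
batches = []
batcher = EventBatcher(batches.append, interval=0.5, capacity=3,
                       clock=lambda: now[0])
for i in range(3):
    batcher.add(i)                            # buffered: interval not yet elapsed
batcher.add(3)                                # buffer full: first batch is sent
now[0] = 0.6
batcher.add(4)                                # interval elapsed: second batch
```

The overflow policy is a constructor flag here, mirroring the setting made at the control station.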

Communication between the VSA 100 and the LECs also 
exists to establish clock skews so that event times through- 
out the distributed application can be synchronized. Any 
known clock skew calculators can be used for this purpose.
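
As one example of a known approach, a round-trip estimate that assumes symmetric network delay (in the style of Cristian's algorithm) can be sketched as follows. This is not necessarily the calculator used in any implemented embodiment; the function and parameter names are hypothetical.

```python
def estimate_skew(t_request_sent, t_remote, t_reply_received):
    """Estimate remote clock minus local clock, assuming symmetric delay."""
    midpoint = (t_request_sent + t_reply_received) / 2.0
    return t_remote - midpoint


def to_local_time(remote_timestamp, skew):
    """Translate a remote event timestamp onto the local (VSA) timeline."""
    return remote_timestamp - skew


# The VSA asks an LEC for its clock at local time 100.0 and hears back at 101.0.
skew = estimate_skew(t_request_sent=100.0, t_remote=250.5, t_reply_received=101.0)
local = to_local_time(251.0, skew)
```

With the skew for each machine known, event times from different machines can be placed on one timeline before they are compared.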

A DEC is similar to an IEC except that it deals with data 
whose value can be measured continuously, and whose 
values need to be recorded at regularly scheduled intervals. 
To reduce system complexity and increase flexibility in 
handling data, these "measured" events are treated internally
just like events that are triggered by the system's behavior. 
This allows collection, synchronization, and analysis of both 
event-driven and time-driven data. 

As opposed to an IEC which reports on the occurrence of 
events (i.e. "this thing happened"), a DEC gathers
information on a time basis, such as memory usage within the
system, not necessarily events coming from within the 
application. For example, a DEC might every second mea- 
sure the memory usage of the system and send back an event 
that says "current memory usage is 2 megabytes". A DEC
could also report on disk usage or CPU usage. A DEC could 
be created within the application itself to measure
application-specific parameters such as, for example, the
number of queries currently executing within a database 
system or the number of words currently misspelled in a 
word-processing document. Generally speaking, a DEC can 
measure any continuously varying data, i.e., anything which 
could be represented by a graph. 
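
A DEC that samples a continuously varying quantity on a fixed period and reports each sample as an ordinary event can be sketched as follows. The DEC class, its poll method, and the event dictionary layout are hypothetical names chosen for illustration.

```python
class DEC:
    """Hypothetical dynamic event creator: samples a value on a fixed period."""

    def __init__(self, name, read_value, period):
        self.name = name
        self.read_value = read_value          # callable returning the current value
        self.period = period                  # seconds between samples
        self.next_due = 0.0

    def poll(self, now):
        """Return a list holding zero or one events for the given time."""
        if now < self.next_due:
            return []
        self.next_due = now + self.period
        return [{"source": self.name, "time": now, "value": self.read_value()}]


running_queries = [3]
dec = DEC("db.running_queries", lambda: running_queries[0], period=1.0)

events = []
for tick in (0.0, 0.5, 1.0, 1.5, 2.0):
    if tick == 1.5:
        running_queries[0] = 7                # the measured quantity changes
    events.extend(dec.poll(tick))
```

Because each sample is emitted in the same shape as any other event, the same collection and analysis path can handle both event-driven and time-driven data.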

The VSA 100 collects all reported information and stores 
it in an efficient centralized store. The centralized store can 
simply be a data file in which data is organized in a certain 
way, i.e. a memory-mapped file. Other embodiments of an 
efficient data store could be a relational database, an 
in-memory data structure, a regular file, or any other suitable 
structure which can handle large volumes of data with an 
efficient access time. 

Once written to, it can be read many times. Data is 
organized so it's easy to write, since incoming data volume 
can be very high, and also so it's easy to read directly from 
disk, since dataset size will typically preclude loading all 
data into memory. 

Since data collection for one embodiment of the invention 
doesn't involve a multiple update problem, this was taken 
into consideration in designing the data structure. File- 
mapped memory buffers were used so that information could 
be quickly retrieved from disk and stored into memory in an 
efficient way. Thus the system is able to receive potentially 
many thousands of events per second. Event data is stored on
disk in the order that it arrives.
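
An append-only store that is written in arrival order and read back through a memory-mapped file can be sketched as follows. This is a minimal sketch: the fixed record layout (a timestamp and an event code) and the function names are assumptions made for illustration, not the actual file format.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<dI")                 # hypothetical layout: time, event code


def append_events(path, events):
    """Append records in arrival order; existing records are never updated."""
    with open(path, "ab") as f:
        for timestamp, code in events:
            f.write(RECORD.pack(timestamp, code))


def read_event(path, index):
    """Read one record from the mapped file without loading the whole dataset."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return RECORD.unpack_from(m, index * RECORD.size)


path = os.path.join(tempfile.mkdtemp(), "events.dat")
append_events(path, [(1.0, 10), (1.5, 11)])
append_events(path, [(2.0, 12)])
```

Because records are only ever appended, a writer never has to coordinate updates to existing data, and a reader can locate any record by its index alone.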

It will be apparent to one of ordinary skill in the art that 
the present invention is equally applicable to a distributed 
system in a single machine. A single machine can be running 
more than one process, for example an operating system and 
a data base application. 

It will further be apparent to one of ordinary skill that if 
the performance cost of a context switch is not of great 
concern, then it could in fact be carried out, provided that 
one appropriately factors it into the performance analysis. 

It will be appreciated that just because the LEC is col- 
lecting something doesn't mean that it is necessarily what 
the VSA user wants. As will be explained below, user- 
specified filtering can occur in the IEC or in the LEC to 
reduce the information. In addition, the LEC, in a currently 
implemented embodiment, can buffer all or a substantial 
portion of the information that it sends out to the VSA, so it 
sends bursts on the network rather than continuous traffic. In 
addition, it can also run at a lower priority, so it's buffering
up all of the information rather than directly slowing down 
the application. In addition, it can further compress data to 
reduce network overhead. 

Operation of VSA 

The operation of the VSA will now be described. When an 
application starts up, the operating system software or the 
"middleware" that the application is using creates an IEC 
reference, and if there's an LEC on the system the IEC 
reference hooks itself up to the LEC. However, if no one is 
analyzing the system yet, there will be no LEC yet, and the 
IEC reference will remain unhooked up. 

Then the IEC reference goes into quiescent mode. The 
application keeps running, and nothing special is going on to 
slow it down. 

Now, if someone wants to analyze what's going on, they 
turn on the VSA 100, and they indicate that they want to 
hook up to a particular machine, so it turns on an LEC on 
that system. That LEC connects to all of the IECs on that 
system, and it starts any DECs, for example to monitor CPU
usage. DECs measure and report on time-based interval
events, while IECs watch for and report on the occurrence
of events. It will be apparent to one of ordinary skill that
while the LEC is created by the VSA 100 in a currently
implemented embodiment, it could be automatically created
when the first IEC reference exists.

The VSA user specifies what information is to be
collected. That information is broken down per machine and
passed to the LEC for each machine. The LEC then breaks
that down, per IEC, and basically turns the IECs on or off
where appropriate. When IsActive is set True in an IEC, it
is no longer quiescent, and that IEC starts sending collected
data to its associated LEC. When the user shuts down the
VSA, the IECs, DECs, and LECs revert back to their
quiescent states.

The interface between the VSA and an LEC can operate
under DCOM. Everything else can run under COM, except
for the shared memory communication between the IEC and
the LEC. The IEC writes information into a shared memory
buffer and never takes a process context switch. COM is
used basically only for initialization.

A third party developer is able to write a COM interface
for its application and use the VSA to analyze its
performance. It doesn't have to link any additional libraries.

Data Design - Pre-Defined Event Fields and
Custom Fields

FIG. 4 illustrates in schematic fashion pre-defined event
fields and custom fields which are included in an event
packet within an exemplary embodiment of the invention.
Pre-defined event fields are generally always present in an
event packet whether the user specifies them or not. Custom
fields can also be assigned by a user. In the invention each
event may include a number of pre-defined or standard
pieces of information, as well as custom or arbitrary
user-specified information. This information becomes important
when filter reduction occurs, as will be described further
below.

As shown in FIG. 4, a VSA event comprises pre-defined
event fields 160 and custom fields 162. Not all pre-defined
event fields have to be provided for every event. Pre-defined
event fields 160 enable the data structure of the invention. If
the user doesn't specify pre-defined event fields, intelligent
default values are automatically provided for them.

Custom fields 162 can be generated by the user, but none
of them is essential to the data design.

What distinguishes pre-defined event fields from custom
fields is that pre-defined event fields have pre-defined
semantics and are therefore useable by the analysis
mechanism to determine the interrelationship among events.
Without pre-defined event fields, the analysis mechanism would
be unable to make any reasonable deductions about the
events and would only be able to provide a useless list of
events. Further, the set of pre-defined event fields is
optimized for effective and efficient analysis. The specific names
and functions are described in Table 1 below.

Some important pre-defined event fields are the Machine,
Process, Entity (referred to as "Component" in Table 1
below and in the APIs), Instance (referred to as "Session" in
Table 1 below and in the APIs), and Handle fields, both for
the Source as well as for the Target. Their use will be
explained in greater detail below.

Pre-defined event fields are listed in Table 1 below:

TABLE 1

[...]
Causality ID
CorrelationID
DynamicEvent Data
[...]
ReturnValue
[...]
TargetComponent
TargetHandle
TargetProcess
TargetProcessName
TargetSession
TargetThread
[...]
Entity
Instance

Because the default set of events is large, pre-defined
event categories are provided to visually organize the events
in the filter editor. Each event belongs to exactly one
category, and each category may have any number of
[...]. Each [...] number
of [...] categories. The combination of all of the events and
[...] leaves [...]
categories. Event categories have no semantic
[...] on the event but do allow the filter to be displayed
[...] and [...] more efficient. Event categories have
merely an organizational [...] in that they help the user
understand events.

Pre-defined event categories are listed in Table 2 below:

Pre-Defined Event Categories

[...]

Each event has a type. The type is used to distinguish
events that come from DECs. The event type is also used to
[...] from [...] inbound ([...] RETURN) [...]
distinction is important to matching up the steps of four
events mentioned later regarding a CALL/ENTER/LEAVE/
RETURN sequence. If an event belongs to either of these
categories, then it is called generic.

Event types are unrelated to event categories. Events of
the same type may be in different categories, and events in
the same categories may be of different types.

There are different types of events. The event type is used
to specify how VSA 100 should interpret the event. Event
types are listed in Table 3 below:

TABLE 3

[...] a grouped event).
[...] Outbound means [...]



The data design of the present invention allows the user 
to define his or her own events and event taxonomy. 
However, to provide some basic interoperability between 
data (so that generic analysis tools can be written and/or 
used), in one embodiment of the invention some typical 
events are defined. Compliant event generators within this 
embodiment are encouraged to use these events rather than 
to define their own. This helps simplify the filter editor.
Alternative embodiments could either have no typical events 
or a very large set of typical events. The choice of typical 
events is merely dictated by the kind of events that are 
expected to be common within the embodiment of the 
invention which is implemented.

Table 4 below identifies pre-defined events and their
categories and types:

TABLE 4

Event: [...] Call Data; Component Start; Component Stop;
Enter; Enter Data; [...] Leave Data; Leave Exception;
Leave Normal; Query Enter; Query Leave; Query Result;
Query Send; [...] Return Exception; [...]

Category: [...] Call/Return; Call/Return; Call/Return;
Query/Result; [...] Query/Result; Query/Result;
Call/Return; [...] Call/Return; [...] Transaction; [...]

Type: [...] Outbound; Outbound; Inbound; Outbound;
Inbound; Outbound; Inbound; [...]

In Table 4, the "Category" descriptors are merely
annotational, not semantic.

A brief description of each Event listed in the "Event"
column will now be given:

A "Call" event is the first step of a four-part Call/Enter/
Leave/Return transition. A function call is departing
from a caller.

"Call Data" means subsidiary data to a call is being
transmitted. This always follows a Call.

"Component Start" means a component has been created
and is starting to execute (note that "component" in this
sense is not the same as an "entity" as used herein; it
means a real component).

"Component Stop" means a component has been
destroyed and is stopping its execution (note the
comment above).

"Enter" means the second step in a four-step transition. A
function call is arriving at the callee.

"Enter Data" means subsidiary data to an Enter has been
received.

"Events Lost" means the system has had to discard events
to avoid overloading the eventing infrastructure.

"Leave Data" means subsidiary data to a leave has been
transmitted from a callee to the caller.

"Leave Exception" means an exception (error) has been
transmitted from the callee to the caller. This is the third
step in the four-part transition.

"Leave Normal" means a success has been transmitted
from the callee to the caller. This is the third step in the
four-part transition.

"Query Enter" means a database query has arrived at the
[...]

"Query Leave" means a database query has been
completed.

"Query Result" means a database query result set has
started transmitting back to the caller.

"Query Send" means a database query has left the caller.

"Return" means the fourth step in the four-part transition.
Control has returned to the caller.

"Return Data" means subsidiary data to a Return has been
received at the caller.

"Return Exception" means an exception (error) has been
received at the caller. This is the fourth step in the
four-part transition.

"Return Normal" means a success has been received at
the caller. This is the fourth step in the four-part
transition.

"Transaction Commit" means a transaction has been
committed successfully.

"Transaction Rollback" means a transaction was aborted.

"Transaction Start" means a new transaction was created
and started.

"User" means an unknown event.

Data Design - E0/E1 Entity Transition

FIG. 5 illustrates a transition between two entities, E0 and
E1, within the hardware and operating environment. A
"transition" occurs when one entity (e.g. a program, process,
or object) turns execution over to another to complete a
specific task. In FIG. 5 the illustrated transition comprises
four events, a Call event, an Enter event, a Leave event, and
a Return event.

To understand the structure and behavior of distributed
systems, it is important to understand transitions between
different application entities. The VSA employs an
innovative data design that allows two communicating
entities to describe their interactions despite knowing almost
nothing about each other. Each participant in a transition
provides only information about its environment, plus a
unique identifier that allows the entity at the other end of the
transition to link the pair of events. Every destination called
needs to have a unique i.d., and every source of a Call has
a unique i.d. In an embodiment which was implemented,
these unique i.d.'s are GUIDs.
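
The four steps of a transition, together with an event packet that carries pre-defined fields and free-form custom fields, can be sketched as follows. The field names, the empty-string defaults, and the next_step helper are hypothetical; they illustrate the shape of the data rather than the actual field set of Table 1.

```python
from dataclasses import dataclass, field
from enum import Enum


class Step(Enum):
    """The four steps of a transition, in order."""
    CALL = 1                                  # function call departs from the caller
    ENTER = 2                                 # function call arrives at the callee
    LEAVE = 3                                 # result is transmitted from the callee
    RETURN = 4                                # control has returned to the caller


@dataclass
class Event:
    """Hypothetical event packet: pre-defined fields plus custom fields."""
    name: str
    step: Step
    correlation_id: str = ""                  # defaults apply when not specified
    causality_id: str = ""
    source_machine: str = ""
    source_process: str = ""
    target_machine: str = ""
    target_process: str = ""
    custom: dict = field(default_factory=dict)


def next_step(step):
    """Return the step expected after `step`, or None after RETURN."""
    return Step(step.value + 1) if step is not Step.RETURN else None


call = Event("Call", Step.CALL, correlation_id="c-1", source_machine="A",
             custom={"sql_length": 42})
```

A Call event knows only its source side, so the target fields stay at their defaults until a matching Enter event supplies them.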

This design has a number of benefits. First, because entity 
systems typically already include a quasi-unique identifier
for transitions, no extra information needs to be transmitted
between the two entities. Second, each entity data load is 
reduced through less duplicated data. 

Each application is treated as a series of black boxes. A 
"transition" is defined as when an application moves from
one of those boxes to another one. So if we have a Client and 
a Server, a transition occurs when we go to the Server, and 
another occurs when we go back. In a three-tier design, a 
transition occurs for Client to Server, Server to Database, 
Database to Server, and Server to Client movements. These
are entity to entity transitions and not necessarily machine to 
machine transitions. 

One example of an entity to entity transition is one COM 
client component calling a COM server component.
Essentially four events represent that transition, which can be a
remote procedure Call (RPC) within a distributed system. 
An event from the client says "I'm initiating a Call". An 
event at the server says "I've entered the server". An event 
at the server says "I'm leaving the server". And finally an 
event at the client says "I've returned". In the case of COM,
an event occurs at both sides of the transition. 

By looking at all or nearly all of these events and taking 
appropriate pieces of information about them and correlating 
them, a great deal of information is derived about the
structure and performance of the system, and accordingly a 
performance model of the system can be constructed. 
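
As a small example of what correlating the four events can yield, the time spent in the callee and the time spent in transit can be derived from the event timestamps. This sketch assumes the four timestamps are already on a common timeline; the function name and dictionary keys are hypothetical.

```python
def transition_timing(call_t, enter_t, leave_t, return_t):
    """Split one Call/Enter/Leave/Return transition into its time components."""
    total = return_t - call_t                 # as seen by the caller
    in_callee = leave_t - enter_t             # as seen by the callee
    return {
        "total": total,
        "in_callee": in_callee,
        "in_transit": total - in_callee,      # outbound plus inbound overhead
    }


timing = transition_timing(call_t=10.0, enter_t=10.25, leave_t=10.75,
                           return_t=11.0)
```

Aggregating such figures per pair of entities is one way a performance model of the system could be assembled.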



Data Design — Determination of Source/Target 
Relationship 






FIG. 6 is a table which illustrates how pre-defined event 
fields are used to establish a relationship between a source 
entity and a target entity. 

For each of the events involved in a Call, Enter, Leave, 
and Return sequence, the event producer specifies the
Machine of the source, the Process of the source, the Entity 
(e.g. class, such as ADO) of the source, and the Instance of 
the source. 

Thus the VSA knows the Machine, Process, Entity, and
Instance at the Source for a Call event, but it doesn't know 
the Machine, Process, Entity, and Instance at the Target for 
a Call event. And for the Enter event, the situation is 
reversed. The VSA doesn't know it for the Source, but it 
does know it for the Target. In almost all cases the events are
fired at the place the event is happening. 

Using this information the VSA is able to piece together 
a functional block diagram of the system as described below. 

There are basically two kinds of users that use VSA. 
There are people who give us events, and there is the actual
end user who is collecting data to understand it. The data 
design of the invention is manipulated and used by the 
portion of the operating system that gives us events, and the 
end user doesn't really need to understand it in great depth. 
This format makes it possible to draw a block diagram of the
system, even though no one piece knows what the system 
should look like. 
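
How a block diagram can emerge even though no single event describes both ends of a transition can be sketched as follows. A Call event carries only source information and an Enter event carries only target information; joining them on a shared identifier yields one edge of the diagram. The dictionary keys and the "link" identifier are hypothetical stand-ins for the pre-defined fields.

```python
def build_edges(events):
    """Join Call and Enter events on a shared key to get source -> target edges."""
    sources = {}
    targets = {}
    for e in events:
        if e["type"] == "Call":
            sources[e["link"]] = e["source"]
        elif e["type"] == "Enter":
            targets[e["link"]] = e["target"]
    edges = set()
    for link, source in sources.items():
        if link in targets:                   # both halves were observed
            edges.add((source, targets[link]))
    return sorted(edges)


events = [
    {"type": "Call", "link": "g1", "source": ("A", "client.exe", "Form")},
    {"type": "Enter", "link": "g1", "target": ("B", "server.exe", "Orders")},
    {"type": "Call", "link": "g2", "source": ("B", "server.exe", "Orders")},
    {"type": "Enter", "link": "g2", "target": ("C", "db.exe", "Query")},
    {"type": "Call", "link": "g3", "source": ("A", "client.exe", "Form")},
]
edges = build_edges(events)
```

The unmatched third Call produces no edge, since its target was never observed.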

In most existing systems, E0 and E1 have a very weak
relationship. The data design of the present invention is
innovative in that it can tolerate this weak relationship and
still provide useful results. E0 doesn't really need to know
what machine E1 is on, and vice versa. Even though these
two entities communicate through the system, e.g. via COM,
they don't really know about each other. So when a Call
event is fired by E0, it doesn't really know whom it's talking
to. When E1 fires the Enter event that goes with that Call
event, it doesn't really know that that Enter event goes with
that Call event. So the small amount of information that the
operating system has is leveraged to make sure that the Call
event maps to the Enter event. The Handle, the Correlation
i.d., and the Causality i.d. fields are largely responsible for
enabling an Enter event to be linked with a Call event.

There are generally two kinds of events. There are asyn- 
chronous events, e.g. "this thing happened". And there are 
transition events, e.g. going from E0 to E1. When you have
a transition event, you typically have a transition back. The 
user firing the event specifies a Correlation i.d., which 
enables the Call event to be identified with the Return event. 
The Call and Return have the same Correlation i.d., and the 
Enter and Leave have their own Correlation i.d. Each 
Correlation pair matches up exactly one pair of Enter/Leave 
and Call/Return to enable the VSA to understand how to 
match up the pairs. 

Each event source has its own notion that correlates a 
CALL with a RETURN. For example, COM is able to 
generate a GUID based on the current execution context and 
processor. In an alternative embodiment, a Correlation i.d. 
could be generated using the time the CALL was made. 
Generation of a Correlation i.d. is typically simple but 
cannot really be generalized. Each IEC caller must pick its 
own scheme. Even within a currently implemented 
embodiment, several schemes for generating Correlation 
i.d.'s coexist. 

Another key piece of information is the Causality i.d. This 
is normally provided by COM, but any entity can provide its 
own value if desired. Whenever a COM RPC is created, a 
GUID is created for that RPC. That information is tracked 
around the network, e.g. for purposes of identifying when a 
circular reference has been created. For the purposes of the 
present invention, it is used to match things up. It's basically 
a unique i.d. to identify a particular stream of calls and to 
sort them out. It says that this Call goes with this Return, and 
that this Enter goes with this Leave. The VSA knows from 
the Causality i.d. that these are all somehow interrelated. 

In general, the Correlation i.d. operates on the events that 
are known to one machine, and the Causality i.d. operates on 
events that occur across machines. 
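The matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout and the Causality i.d. value "X1" are hypothetical, and the Correlation i.d. values "CA" and "CB" are placeholder labels.

```python
from collections import defaultdict

# Hypothetical event records: (kind, correlation_id, causality_id).
events = [
    ("CALL",   "CA", "X1"),
    ("ENTER",  "CB", "X1"),
    ("LEAVE",  "CB", "X1"),
    ("RETURN", "CA", "X1"),
]

def pair_by_correlation(events):
    """Group event kinds that share a Correlation i.d."""
    pairs = defaultdict(list)
    for kind, correlation_id, _ in events:
        pairs[correlation_id].append(kind)
    return dict(pairs)

def group_by_causality(events):
    """Collect the Correlation i.d.s that belong to one Causality i.d."""
    groups = defaultdict(set)
    for _, correlation_id, causality_id in events:
        groups[causality_id].add(correlation_id)
    return dict(groups)

pairs = pair_by_correlation(events)
groups = group_by_causality(events)
```

Here the Correlation i.d. yields the two pairs (Call/Return and Enter/Leave), while the shared Causality i.d. ties both pairs into one set.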

A Handle is a way of referencing an individual instance of
an entity. Handles are used by a calling entity to call 
(reference) a particular instance of an entity. Thus, the 
calling entity knows what Handle it is calling, and the entity 
being called (the target) knows its own Handle. When this 
process is applied for both the source and the target (each of 
which will have its own Handle), it is possible to collect 
together four events into the standard group of CALL/ 
ENTER/LEAVE/RETURN. It is important to realize that 
any entity instance can have many different Handles that 
refer to it. For example, when A and C are both talking to B, 
A might use the Handle "BAT" to refer to B, where C might
use the Handle "BALL" to refer to B. 

From the information contained in the table shown in FIG. 
6, the VSA deduces that Call 170 goes with Return 176, and
that Enter 172 goes with Leave 174. The VSA knows they're
related. By knowing that the Source Handle 180 for Call 170 
corresponds to Source Handle 186 for Enter 172, and that 
Target Handle 182 for Call 170 corresponds to Target 
Handle 184 for Enter 172, it knows that Call 170 is linked 
with Enter 172. In similar fashion, the VSA determines that 
Enter 172 is linked with Leave 174, and that Leave 174 is 
linked with Return 176. 

The table shown in FIG. 6 will now be described in detail 
to illustrate how a relationship can be deduced between a 
source entity and a target entity. The table of FIG. 6 shows 
a standard four-event transition sequence. This sequence is
not the only possible one but is merely one example. 

In this example, the CALL event fires, and the system is
given full information about the source but only knows the
target Handle is H1. When the target fires the ENTER event,
two deductions can be made: (1) the CALL event can now
be filled in, and (2) Handle H1 (the target) has now been
defined to be M1, P1, E1, I1. So the CALL event is now
completely specified. Additionally, the ENTER event uses
Handle H0 which was previously defined to be M0, P0, E0,
I0, and so the ENTER event can be completely filled in too.

When the LEAVE event arrives again from the target, two
more deductions can be made: (1) the source information for
the LEAVE event can be filled in by noticing that Handle H0
has previously been defined to mean M0, P0, E0, I0, and (2)
we can now deduce that this LEAVE event and the previous
ENTER event are a pair, because they have the same
Correlation i.d. (i.e. "CB").

When the final RETURN event arrives, three deductions
can be made: (1) we can fill in the target information for the
RETURN event, because we know that H1 means M1, P1,
E1, I1, (2) we can pair this RETURN up with the previous
CALL by noticing that the Correlation i.d. ("CA") matches
that of the CALL event, and (3) all four events are a set
because their Causality i.d. is the same, and they have two
pairs of matching Correlation i.d.'s.
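The Handle deductions in this walk-through can be illustrated with a small Python sketch. It is a simplification under assumptions: the dictionary field names are hypothetical, and it resolves Handles in two passes over the complete sequence rather than incrementally as each event arrives.

```python
# Identity tuples are (Machine, Process, Entity, Instance).
SRC = ("M0", "P0", "E0", "I0")
TGT = ("M1", "P1", "E1", "I1")

# Each event carries both Handles but full identity for only one side.
events = [
    {"kind": "CALL",   "src_handle": "H0", "tgt_handle": "H1", "source": SRC,  "target": None},
    {"kind": "ENTER",  "src_handle": "H0", "tgt_handle": "H1", "source": None, "target": TGT},
    {"kind": "LEAVE",  "src_handle": "H0", "tgt_handle": "H1", "source": None, "target": TGT},
    {"kind": "RETURN", "src_handle": "H0", "tgt_handle": "H1", "source": SRC,  "target": None},
]

def resolve(events):
    """Fill in the unknown side of each event from Handle definitions."""
    handles = {}
    # Pass 1: learn what each Handle means from the side that knows it.
    for ev in events:
        if ev["source"] is not None:
            handles[ev["src_handle"]] = ev["source"]
        if ev["target"] is not None:
            handles[ev["tgt_handle"]] = ev["target"]
    # Pass 2: complete every event using the learned definitions.
    return [
        dict(ev,
             source=handles.get(ev["src_handle"]),
             target=handles.get(ev["tgt_handle"]))
        for ev in events
    ]

resolved = resolve(events)
```

After resolution every event in the sequence carries both the source and the target identity, which is the state the walk-through reaches once the RETURN event has arrived.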

The proper choice of a Handle depends in part on the 
entity causing the event. As in the case of a Correlation i.d., 
the generation of a Handle is typically simple but cannot 
really be generalized. Several routine schemes for generat- 
ing Handles exist within a currently implemented embodi- 
ment of the invention. 

It generally takes all three pieces of information together 
in context to create a functional diagram of how all of the 
pieces communicate. No single piece of information is vital 
to successful analysis. Dropping one or more fields still 
allows an implemented embodiment of the invention to 
generate useful analysis data. However, the removal of all 
source information makes it impossible to recognize a 
transition, for example, and thus impossible to diagram 
transitions in the system. Similarly, the loss of critical data 
such as the Correlation i.d. makes it impossible to draw a 
tree of events. 

It will be understood by one of ordinary skill that other 
options for ensuring that a source and a target can appro- 
priately identify themselves are possible. 

Triggers 

FIG. 7 illustrates in schematic fashion how events 
selected by a user are monitored. Triggers enable the VSA 
user to watch for a selected condition or error to occur. In 
many cases, a developer knows that an error will occur, but 
he or she doesn't know exactly when it will occur. The 
present invention allows the developer to set a trigger for 
collecting data in these situations. 

Triggers can be set either for conditions for which an IEC 
creates an event, such as "a COM event in Machine A", or 
for conditions for which a DEC creates an event, such as 
PerfMon data reflecting CPU utilization. 

The user can use Boolean operators, for example "OR" 
and "AND", to specify a set of two or more trigger condi- 
tions to watch. For example, a client can request to be alerted 
when a first designated CPU utilization OR a second des- 
ignated CPU utilization exceeds 75%. Alternatively, an alert 
could happen when CPU utilization exceeded 75% AND 
disk utilization was less than 10%, potentially highlighting
the need to obtain additional processing power. 
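The two example trigger conditions can be written as simple predicates. The following Python sketch is illustrative only; the function names and default thresholds are assumptions chosen to mirror the 75% and 10% figures used in the examples.

```python
def cpu_or_trigger(cpu_first, cpu_second, threshold=75.0):
    """Fire when either designated CPU exceeds the threshold (OR)."""
    return cpu_first > threshold or cpu_second > threshold

def cpu_and_disk_trigger(cpu, disk, cpu_threshold=75.0, disk_threshold=10.0):
    """Fire when CPU is high while disk utilization is low (AND)."""
    return cpu > cpu_threshold and disk < disk_threshold

or_fired = cpu_or_trigger(40.0, 80.0)          # second CPU is above 75%
and_fired = cpu_and_disk_trigger(90.0, 5.0)    # busy CPU, idle disk
and_quiet = cpu_and_disk_trigger(90.0, 50.0)   # disk is busy too
```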

A developer can also specify a first filter for "normal"
event-monitoring, and a second filter (which can be more
detailed or comprehensive than the first filter) to apply when
the trigger condition occurs. A "filter" is a way in which the
system user can specify what is to be monitored in the
system under examination. Filters will be discussed in
greater detail below in the sub-sections entitled "Filter
Reduction", "Filter Combination", and "Filter Specification".

In FIG. 7 an LEC 192 is depicted monitoring an application
190. Events created by IECs and DECs (not illustrated
in FIG. 7) are collected by LEC 192. Upon the
occurrence of a trigger condition, LEC 192 dumps the events
to the VSA 100 or else signals an alert to the VSA 100.

While watching for one or more trigger condition(s),
event monitoring continues as usual, but data requested only
by the trigger filter is not logged, while data requested by the
monitoring filter continues to be logged as normal.

While waiting for a trigger condition to occur, events are 
retained transiently by the LEC 192 in a circular buffer 
whose size can be specified by VSA 100. For example, VSA 

100 can specify that the buffer store 500 events, so when the
501st event comes in, the first event is written over.
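The overwrite behavior of such a circular buffer can be shown with Python's standard `collections.deque`, which discards the oldest entry once `maxlen` is reached. This is a sketch of the buffering idea only, not of the LEC's actual storage; events are represented here by their sequence numbers.

```python
from collections import deque

buffer = deque(maxlen=500)            # buffer size from the example above

for event_number in range(1, 502):    # events 1 through 501
    buffer.append(event_number)

oldest_retained = buffer[0]           # event 1 has been written over
newest_retained = buffer[-1]
```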

When the user's specified trigger condition is detected,
the LEC 192 can immediately transmit all of the buffered
events to the VSA 100 for logging. These provide data about
the application prior to the failure or other condition. In
addition, the LEC 192 can start collecting more events at a
higher rate (in accordance with the second filter, for
example), which events provide additional detailed information.

VSA 100 can also specify a reset condition, either as part
of the second filter or as a separate filter. When the reset
condition is met, the LEC 192 returns to the low-impact
minimal collection condition specified by the first filter and
once again monitors for a trigger condition.
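The buffer, trigger, and reset behavior described in the preceding paragraphs amounts to a small two-state machine. The Python sketch below is a hypothetical model: the `Collector` class, its method names, and the string events are assumptions made for illustration and do not represent the LEC's real interface.

```python
from collections import deque

class Collector:
    """Hypothetical stand-in for an LEC; all names here are illustrative."""

    def __init__(self, trigger, reset, size=500):
        self.trigger = trigger            # predicate: does this event trip the trigger?
        self.reset = reset                # predicate: does this event meet the reset condition?
        self.buffer = deque(maxlen=size)  # transient history kept while waiting
        self.triggered = False
        self.sent = []                    # events forwarded for logging

    def on_event(self, event):
        if not self.triggered:
            self.buffer.append(event)
            if self.trigger(event):
                self.triggered = True
                self.sent.extend(self.buffer)  # flush the buffered history
                self.buffer.clear()
        else:
            self.sent.append(event)            # detailed collection after the trigger
            if self.reset(event):
                self.triggered = False         # back to low-impact buffering

collector = Collector(trigger=lambda e: e == "error",
                      reset=lambda e: e == "ok", size=2)
for event in ["a", "b", "c", "error", "d", "ok", "e"]:
    collector.on_event(event)
```

With a buffer of two events, the trigger flushes only the most recent history ("c" and "error"); events are then forwarded directly until the reset condition returns the collector to buffering.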

It will be apparent to one of ordinary skill in the art that
suitable data compression techniques can be applied to
increase the efficiency of the event buffering and data
transmission aspects of the invention. Data compression can
be used both for storing events and for sending large
quantities of events or event-related data through the data
processing system.

Data Security 

Information that is processed by a system performance
analysis tool is likely to be confidential. Like any debugging
tool, the VSA should ensure that the debuggability of the
system cannot become a security hole. Additionally, VSA
debugging is a shared resource in a distributed environment.
As such, it is important that proper security precautions be
taken to prevent malicious users from obtaining this data.

The invention provides a secure environment for data
collection through the use of discretionary access controls.
These access controls can be applied, at the discretion of the
user, to the collection of data from a specific machine, to the
monitoring of specific entities, and to the collection of
specific events.

In one aspect of the invention VSA 100 is implemented as
a DCOM server which can be configured to run as any
identity, so it can control the resources and information it has
access to. In addition, the server can run in a Windows NT
authenticated domain, so that access to the server can be
controlled by discretionary access controls based on
authentication identities.

It will be apparent to one of ordinary skill in the art that 
discretionary access enforcement can be based on the pro- 
cesses desired to be monitored effectively. It will also be 
apparent to one of ordinary skill in the art that suitable 
encryption techniques can be employed to enhance security 
within the VSA. Since DCOM is used to communicate with 
the server, standard RPC encryption can be used. In addition, 
the use of COM's custom marshalling allows
virtually any type of encryption technology to be used.

Filter Reduction 
FIG. 8 illustrates a process of filter reduction as used 
within an exemplary embodiment of the invention. First, the 
use of filters within the context of the invention will be 
discussed. VSA users specify the desired information to 
monitor via a User Filter 200. That is, a filter defines what 
information the VSA will collect and analyze. Users can 
specify this information in a "system" scope, for example, 
"All COM and ADO events from Machines A and B". In 
addition to directing a filter to a machine, a filter can be
directed to a process, component (e.g. ADO), IEC, DEC,
event, thread, or to multiples or combinations of the foregoing.

The user filter 200 can comprise a filter 202 for Machine
A which in turn can comprise filters 204, 206, 208 for
Processes A1, A2, A3, respectively. Likewise user filter 200
can comprise a filter 212 for Machine B that in turn
comprises filters 214, 216, 218 for Processes B1, B2, B3,
respectively.

A filter can generally be expressed as a single Boolean 
expression in a set of unbound variables. These variables 
communicate to the data provider with events, and to the 
event sources and their categories. Using the example above, 
the filter would be (Machine=A OR Machine=B) AND 
(EventSource=COM OR EventSource=ADO). 

Filter reduction is a process employed by the VSA to
extract the portions of a filter relevant to a specific
portion of the monitoring infrastructure. Using the previous
example, the filter would be reduced by "Machine A" and 
then "Machine B" to determine the filter fragments that are 
specific to each machine. These fragments are transmitted to 
the LECs. The LECs, in turn, reduce the filter by the 
registered entities/processes on the system. The result is a 
filter fragment that can be used to determine if a specific data 
source is enabled or disabled. This information is commu- 
nicated to the IECs to provide the efficient IsActive function. 

Filter reduction is the process of modifying or creating a 
new version of a Boolean expression by binding a subset of 
the variables within the expression. For example, if the 
example filter above is sent to machine C, the Machine=A 
clause can be reduced to FALSE, and the Machine=B clause 
can be reduced to FALSE. Since the expression "FALSE 
AND anything" is FALSE, the whole expression evaluates 
to FALSE for machine C, meaning that all collection infra- 
structure on machine C can be deactivated. 

Another example of filter reduction would be to reduce 
the example filter ("All COM and ADO events from 
Machines A and B") by "Machine=A". This results in the 
filter "EventSource=COM OR EventSource=ADO". Thus 
the result of this filter reduction is a Boolean expression, not 
just a TRUE or FALSE expression. 
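Both reduction examples can be reproduced with a short partial evaluator. The Python sketch below is an assumption-laden illustration: the tuple encoding of the expression tree and the function name `reduce_filter` are hypothetical, chosen only to show how binding a subset of variables yields FALSE, TRUE, or a smaller Boolean expression.

```python
# Expression forms (hypothetical encoding):
#   True / False
#   ("eq", variable, value)
#   ("and", [subexpressions])  /  ("or", [subexpressions])

def reduce_filter(expr, bindings):
    """Bind the given variables and simplify what remains."""
    if isinstance(expr, bool):
        return expr
    op = expr[0]
    if op == "eq":
        _, var, value = expr
        # A bound variable collapses to True/False; unbound stays symbolic.
        return bindings[var] == value if var in bindings else expr
    parts = [reduce_filter(p, bindings) for p in expr[1]]
    if op == "and":
        if any(p is False for p in parts):
            return False
        parts = [p for p in parts if p is not True]
        if not parts:
            return True
    else:  # "or"
        if any(p is True for p in parts):
            return True
        parts = [p for p in parts if p is not False]
        if not parts:
            return False
    return parts[0] if len(parts) == 1 else (op, parts)

# (Machine=A OR Machine=B) AND (EventSource=COM OR EventSource=ADO)
FILTER = ("and", [
    ("or", [("eq", "Machine", "A"), ("eq", "Machine", "B")]),
    ("or", [("eq", "EventSource", "COM"), ("eq", "EventSource", "ADO")]),
])

on_machine_c = reduce_filter(FILTER, {"Machine": "C"})
on_machine_a = reduce_filter(FILTER, {"Machine": "A"})
```

Reducing by Machine C collapses the whole filter to False, so collection there can be switched off, while reducing by Machine A leaves only the EventSource clause for the next level to evaluate.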

The LECs also make use of a specialized form of filter 
reduction to determine which dynamic data is desired. 
Collection and transmission of dynamic data is expensive, 
and a filter is scanned for clauses that specifically refer to the
dynamic information that is required. 

The VSA is communicating with multiple LECs, and to 
operate efficiently it reduces the filter from a global scale 
down to a filter for a particular machine. What goes into an
LEC is that portion of the filter that pertains to a particular 
machine. 

At the next level the LEC breaks the information into
pieces which are germane to each IEC to identify whether or
not that IEC should be turned on or off. So filter reduction
occurs on at least two levels. The first level of filter reduction
occurs at the VSA itself. The second level occurs at the LEC,
which decides which IEC to turn on or off. It will be
apparent to one of ordinary skill in the art that a third level
could be at the IEC level.

If at any point in the reduction the VSA determines that
the filter is guaranteed to be False for a given machine, the
collection mechanism is turned off on that machine. If a filter
specifying "Machine=A and Process=7" is sent to Machine
B, it's just False. Data collection for Machine B is left off
and not turned on, which lets Machine B operate more
efficiently. On Machine A the collection mechanism is left
off for everything except Process 7. This is similar to binding
variables in a Boolean expression. If it's either True or False,
you know what to do. But if it's undefined, you have to send
the expression further down the chain. This feature applies
to processes and components as well. It will be apparent to
one of ordinary skill in the art that it could be applied to any
level, from the machine level down to the thread level.

A machine-specific filter can be broadcast to a given
machine. Generally, the reduction is performed at the client 
machine, and then the reduced filter is broadcast to specific 
machines. Again, it will be apparent to one of ordinary skill 
in the art that specific filters can be applied to any level. 

A third level of filter reduction can occur in the DEC. The
DEC can specify exactly what pieces of information are
being looked for. For example, an event monitoring application
such as PerfMon can collect about 7000 pieces of
information, and it's very expensive to collect each one. So
the filter needs to be reduced further by identifying exactly
which pieces of information to collect. In the VSA user
interface, the user can, if desired, be constrained to select
PerfMon events a certain way, so they can't select them in
complex Boolean expressions. When the filter makes its way
through the network to the right creator, those PerfMon
expressions are specifically referenced to the filter and 
collect exactly those expressions. 
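One way to picture this extra reduction step is a scan of the filter for the dynamic items it names, so that only those are collected. The Python sketch below assumes a hypothetical tuple encoding of the filter and a hypothetical variable name "Counter"; neither comes from the text.

```python
# Expression forms (hypothetical encoding):
#   True / False, ("eq", variable, value),
#   ("and", [subexpressions]), ("or", [subexpressions])

def referenced_values(expr, var):
    """Collect every value that `var` is compared against in a filter tree."""
    if isinstance(expr, bool):
        return set()
    if expr[0] == "eq":
        return {expr[2]} if expr[1] == var else set()
    found = set()
    for part in expr[1]:
        found |= referenced_values(part, var)
    return found

flt = ("and", [
    ("eq", "Machine", "A"),
    ("or", [("eq", "Counter", "CPU"), ("eq", "Counter", "Disk")]),
])

wanted_counters = referenced_values(flt, "Counter")
```

Only the two counters named in the filter would then be sampled, rather than every item the dynamic source is able to report.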

That combination of constraint in the VSA user interface
and appropriate analysis of the results means that the VSA
collects only those things specifically asked for in the
dynamic case. This is important because every time a
dynamic event is timed, one event can be fired every half
second or every second, meaning a lot of events are fired.
This can overwhelm the system infrastructure. So a filter
reduction system is applied to the events that are initiated by
the application. And extra reduction can be applied to events
which are initiated by PerfMon. This could also be done for
events at the IEC if desired.

Filter Combination

FIG. 9 illustrates a process of filter combination as used 
within an exemplary embodiment of the invention. It is 
possible, and likely, that multiple users will be monitoring 
applications running on shared servers. When this occurs,
multiple filters can be issued to the same LEC. To ensure the
most efficient collection, the LEC can combine all of the 
filters prior to performing the entity/process reduction. 
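A minimal Python sketch of combining per-user filters and routing events back to the interested users follows. The filter representation (a mapping from variable to allowed values), the user names, and the helper functions are all assumptions made for illustration.

```python
def matches(user_filter, event):
    """AND across variables, OR across the values allowed for each variable."""
    return all(event.get(var) in allowed for var, allowed in user_filter.items())

def route(filters_by_user, event):
    """Return the users whose filter accepts the event.

    An empty list means no filter wants the event, so under the
    combined filter it need not be collected at all.
    """
    return [user for user, f in filters_by_user.items() if matches(f, event)]

filters_by_user = {
    "user1": {"Process": {"A1", "A2"}},
    "user2": {"Process": {"A2", "A3"}, "EventSource": {"COM"}},
}

both = route(filters_by_user, {"Process": "A2", "EventSource": "COM"})
only_first = route(filters_by_user, {"Process": "A1", "EventSource": "ADO"})
nobody = route(filters_by_user, {"Process": "A3", "EventSource": "ADO"})
```

The combined filter is effectively the OR of the individual filters: an event is collected once if any user wants it and is then delivered to each user whose own filter it satisfies.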
With reference to FIG. 9, a first user generates user filter
1 in box 231, while a second user generates user filter 2 in
box 232. These filters are combined by the LEC into a
merged or combined filter 235, which in turn applies a filter
for process A1 in box 236, a filter for process A2 in box 237,
and a filter for process A3 in box 238. The filters are reduced
after they have been combined.

Appropriate IECs and DECs then monitor and collect
events in accordance with the combined filter. One or more
LECs, depending upon whether the items being monitored
are on one or multiple machines, collect events from the
IECs and DECs, in accordance with the combined filter, and
send them to their respective requesting users, who may be
on a single control station or at multiple control stations.

FIG. 10 illustrates another process of filter combination as
used within an exemplary embodiment of the invention.
With reference to FIG. 10, filters for processes B1-B3 in
boxes 246-248, respectively, are combined in LEC 245 and
passed on to users 1 and 2 in boxes 241 and 242, respectively.

When events are collected by the LEC 245 from different
sources within the data processing system under
examination, it determines which clients are interested and
routes the events to the respective clients who specified that
the events be monitored. Because of the efficient and flexible
nature of the filters, and the general-case nature of the
reduction process described above, monitoring and collection
from multiple machines imposes no extra performance
overhead. Performance is simply as if all the monitoring
were happening from a single machine.

Filter Specification

FIG. 11 illustrates a screen print of an exemplary user
interface for specifying a filter. The VSA provides a large
number of events that can be monitored. Consequently, an
efficient mechanism is provided for the user to specify
desired event data. The user interface (UI) of the invention
provides a quick, easy graphical way for the user to specify
the desired queries.

In the graphical UI, users are presented with three trees,
each appearing in a separate window 250, 252, 254, that
represents the key information: a Machines/Processes window
250, a Components window 252, and a Categories/Events
window 254. The Machines/Processes window 250
presents all of the machines being monitored and the
processes on the machines. The Components window 252
presents the registered VSA data sources on the machines
being monitored. The Categories/Events window 254
identifies all of the registered VSA events that can be monitored.
These can be organized hierarchically in a pre-defined
structure, but the user can tailor it to his or her own structure
and define his or her own events to be monitored.

It will be apparent to one of ordinary skill in the art that
process threads could constitute another level of filter
specification.

Event sources are required to pre-register which events
they can emit when they are installed, and this information
is transmitted at startup from the LEC to the central
machine. By selecting the "Collect" tab 256, the user can
quickly select the desired information to analyze. More
complex queries can be generated by creating groups of
selections using the "OR" tab 258. As the user makes
selections, a textual representation of the query, appearing in
text window 260, is dynamically generated in synchronism
with the graphical depiction in windows 250, 252, and 254,
so the user can verify his or her selection, and understand its
behavior. Finally, the user can specify very sophisticated
filter queries by entering the filter directly as text in text
window 260.

The tree-oriented part of the user interface allows highly
complex filters to be created without a user having to
understand the specific syntax or functionality. The system
takes advantage of the fact that users have built-in
understanding about the "rational" Boolean operators that are used
to combine clauses ("OR" for bindings of the same variable,
"AND" for bindings of independent variables). The same
filter mechanism and user interface are used to both specify
what to analyze and to refine the data which has been
collected and which is presented to the user. VSA 100
analyzes data both as events are collected as well as after
they have been collected. That is, users can filter already
collected data, in a "post mortem" fashion, to create analysis
reports of specific elements of the data without having to
recollect the data.

The user can additionally specify debug and/or trace
switches. These are run-time switches. They have a filter to
determine the appropriate targets. Components, for example,
can access the name/value pairs using the same interface as
the IsActive and FireEvent status conditions.

Thus a user can choose which events to monitor. Boolean
operators can be applied both within the windows and
between the windows. Generally ORs are used within the
windows, while ANDs are used between the windows. In
addition, the UI can enable the user to choose from a
pre-defined list of the "top N" filters or queries, so that the
user can quickly select from the top N.

Location of APIs

FIG. 12 illustrates a system level overview of an exemplary
embodiment showing where APIs of the present invention
can appear within the software architecture of a
distributed computing system.

In a generalized and slightly over-simplified manner, the
software architectures for two separate data processing
systems 301 and 302 are illustrated. Systems 301 and 302
each comprise a plurality of applications, represented by 310
and 340, respectively. Systems 301 and 302 additionally
each comprise software referred to as "middleware" identified
by reference numbers 320 and 350, respectively, and
they each comprise operating system software 330 and 360,
respectively. The above-described software executes in the
processor(s) of data processing systems 301 and 302, the
application programs running under the control of their
corresponding operating systems.

It will be understood that applications 310, 340, middleware
320, 350, and the operating system software 330, 360
can be entirely local to the data processing system 301 or
302, or they can be distributed among data processing
systems 301, 302, and additional data processing systems
(not shown but implied by busses 322 and 342).

Systems 301 and 302 can communicate with each other
over bus 332. Systems 301 and 302 can communicate with
other systems (not shown) over busses 322 and 352,
respectively.

Each system 301 and 302 comprises APIs located in either
the middleware or the operating system or in both. In a
currently implemented embodiment, APIs are located in
both. In order to facilitate utilization of the performance
analysis tools of the present invention by software
developers, APIs are provided to give a wide variety of
functions, in the form of software modules and components,
in common to a broad spectrum of applications. Any one
application typically uses only a small subset of the available
APIs. Providing a wide variety of APIs frees application
developers from having to write code that would have to be 
potentially duplicated in each application. 

The APIs of the present invention offer the application
developer ready access to the built-in performance analysis 
functions appearing in the middleware and operating system 
portions of the software architecture. 

In the next section, various APIs are presented which 
allow applications to interface with various modules and
components of the networking and operating system envi- 
ronment in order to implement the performance monitoring 
and analysis features of the invention. 

Exemplary APIs and their Functions

This section presents and describes exemplary APIs relat- 
ing to the performance monitoring and analysis features of 
the invention. It will be understood that these APIs are 
embodied on a computer-readable medium for execution on
a computer in conjunction with an operating system or with 
middleware that interfaces with an application program 
having one or more event-generating components. 

The APIs will first be described in functional terms. One
or more applications, e.g. applications identified generally
by reference number 310 or 340 in FIG. 12, are assumed to
be running under the control of an operating system, e.g.
operating system 330 or 360. With respect to any one
application program, in particular, the application can have
any of a number of event-generating components. The
application program utilizes APIs (such as APIs 325 or 355
located in middleware 320 or 350, respectively, or APIs 335
or 365 located within operating systems 330 or 360,
respectively) associated with the event-generating component
which operate to receive data from the operating system
and to send data to the operating system.

This set of APIs includes a first interface that enables the
operating system to set or disable a status condition
("IsActive") in the application, and it further includes a
second interface that receives a status query from the
operating system and that returns the status (True or False)
of the status condition to the operating system.

The set of APIs includes an interface that enables the 
operating system to read any one or more of several fields in 
the application. These fields include arguments, causality 
i.d., correlation i.d., dynamic event data, exception, return 
value, security i.d., source component, source handle, source 
machine, source process, source process name, source 
session, source thread, target component, target handle, 
target machine, target process, target process name, target 
session, and target thread. 

Now from the point of view of an operating system, 
consider that an operating system can have an event- 
registering or event-collecting component. The APIs also 
include an interface that enables the operating system to 
query whether a status condition ("IsActive") is set or 
disabled in the application, and they further include an 
interface that returns data to the operating system only if the 
status condition is set. 
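The pattern of checking a status condition before returning data can be sketched as follows. The Python class below is a hypothetical stand-in written for illustration; its names and behavior are assumptions and do not reproduce the actual interfaces.

```python
class EventSource:
    """Hypothetical event-generating component; not the actual interface."""

    def __init__(self):
        self.active = False   # status condition set or cleared by the collector
        self.fired = []       # events actually handed over

    def is_active(self):
        return self.active

    def fire_event(self, name, **fields):
        # Data is handed over only if the status condition is set.
        if not self.active:
            return False
        self.fired.append((name, fields))
        return True

source = EventSource()
skipped = source.fire_event("Call", correlation_id="CA")   # nobody listening yet
source.active = True
sent = source.fire_event("Call", correlation_id="CA")
```

Checking the status condition first keeps the cost of an unmonitored component close to a single Boolean test.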

The APIs detailed below are described in terms of the 
C/C++ programming language. However, the invention is 
not so limited, and the APIs can be defined and implemented 
in any programming language, as those of ordinary skill in 
the art will recognize. Furthermore, the names given to the 
API functions and parameters are meant to be descriptive of 
their function. However, other names or identifiers could be 
associated with the functions and parameters, as will be 
apparent to those of ordinary skill in the art. 

Four sets of APIs are presented: APIs for generating 
events (C interface), APIs for generating events (automation 
binding), APIs for registering events and sources (C 
binding), and APIs for registering events and sources 
(automation binding). 

APIs for generating events used by applications that 
interface with the performance analysis functions of the 
present invention are presented below, both for C interface 
and for automation binding. 



APIs for Generating Events (C Interface):



HRESULT IsActive( 

); 

typedef [v1_enum] enum VSAParameterType {
cVSAParameterKeyString=0x80000000,
cVSAParameterValueMask=0x0007ffff,
cVSAParameterValueTypeMask=0x00070000,
cVSAParameterValueUnicodeString=0x00000,
cVSAParameterValueANSIString=0x10000,
cVSAParameterValueGUID=0x20000,
cVSAParameterValueDWORD=0x30000,
cVSAParameterValueBYTEArray=0x40000,
cVSAParameterValueLengthMask=0xffff,
} VSAParameterFlags; 
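The mask names in this enumeration suggest that a single 32-bit type word packs a key flag, a value-type code, and a value length. The Python sketch below decodes such a word under that assumption; the packing scheme is inferred from the mask names and is not stated in the text, and the type codes assume they occupy bits 16 to 18 as the type mask 0x00070000 implies.

```python
# Constants mirror the enumeration; packing is an assumption (see lead-in).
KEY_STRING        = 0x80000000
VALUE_TYPE_MASK   = 0x00070000
VALUE_LENGTH_MASK = 0x0000FFFF

VALUE_TYPES = {
    0x00000: "UnicodeString",
    0x10000: "ANSIString",
    0x20000: "GUID",
    0x30000: "DWORD",
    0x40000: "BYTEArray",
}

def decode(type_word):
    """Split a type word into its assumed key flag, value type, and length."""
    return {
        "key_is_string": bool(type_word & KEY_STRING),
        "value_type": VALUE_TYPES.get(type_word & VALUE_TYPE_MASK, "unknown"),
        "value_length": type_word & VALUE_LENGTH_MASK,
    }

# A string-keyed parameter whose value is a 16-byte array.
decoded = decode(KEY_STRING | 0x40000 | 16)
```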

typedef [v1_enum] enum VSAStandardParameter {

cVSAStandardParameterSourceThread=2,
cVSAStandardParameterSourceComponent=3,
cVSAStandardParameterSourceSession=4,
cVSAStandardParameterTargetMachine=5,
cVSAStandardParameterTargetThread=7,
cVSAStandardParameterTargetComponent=8,
cVSAStandardParameterTargetSession=9,
cVSAStandardParameterSecurityIdentity=10,
cVSAStandardParameterCausalityID=11,
cVSAStandardPa

cVSAStandardParameterTargetProcessName=13,
cVSAStandardParameterDefaultLast=13,
cVSAStandardParameterNoDefault=0x4000,
cVSAStandardParameterSourceHandle=0x4000,
cVSAStandardParameterTargetHandle=0x4001,
cVSAStandardParameterArguments=0x4002,

cVSAStandardParameterException=0x4004,
cVSAStandardParameterCorrelationID=0x4005,
cVSAStandardParameterDynamicEventData=0x4006,
cVSAStandardParameterNoDefaultLast=0x4006

} VSAStandardParameters; 

typedef [v1_enum] enum eVSAEventFlags {
cVSAEventStandard=0,
cVSAEventDefaultSource=1,
cVSAEventDefaultTarget=2,
cVSAEventForceSend=8

} VSAEventFlags;

HRESULT FireEvent(
[in] REFGUID guidEvent,
[in] int nEntries,
[in, size_is(nEntries)] LPDWORD rgKeys,
[in, size_is(nEntries)] LPDWORD rgValues,
[in, size_is(nEntries)] LPDWORD rgTypes,
[in] DWORD dwTimeLow,
[in] LONG dwTimeHigh,
[in] VSAEventFlags dwFlags
);

"BeginSession" is called by an entity before it fires events
to register its entity and instance names (source and
session).

"EndSession" is called by an entity after it completes

"IsActive" is called by an entity which is considering
firing events and wishes to know if anyone is listening.

"FireEvent" fires an actual event from an entity.

APIs for Generating Events (Automation Binding):

HRESULT EndSession(

);

HRESULT FireEvent(
[in] BSTR guidEvent,
[in] VARIANT rgKeys,
[in] VARIANT rgValues,
[in] long rgCount,
[in] VSAEventFlags dwFlags
);

The descriptions for the above set of "APIs for Generating
Events (Automation Binding)" are the same as for the
C Interface APIs preceding.

APIs for registering events and sources used by applications
that interface with the performance analysis functions
of the present invention are presented below, both for C
interface and for automation binding.

APIs for Registering Events and Sources (C Interface):



HRESULT RegisterCustomEvent(
[in] REFGUID guidSourceID,
[in] REFGUID guidEventID,
[in] LPCOLESTR strVisibleName,
[in] LPCOLESTR strDescription,
[in] long nEventType,
[in] REFGUID guidCategory,
[in] LPCOLESTR strIconFile,
[in] long nIcon

);

HRESULT RegisterEventCategory(
[in] REFGUID guidSourceID,
[in] REFGUID guidCategoryID,
[in] REFGUID guidParentID,
[in] LPCOLESTR strVisibleName,
[in] LPCOLESTR strDescription,
[in] LPCOLESTR strIconFile,
[in] long nIcon

);



HRESULT RegisterDynamicSource(
[in] LPCOLESTR strVisibleName,
[in] REFGUID guidSourceID,
[in] LPCOLESTR strDescription,
[in] REFGUID guidClsid,
[in] long inproc);

HRESULT UnRegisterDynamicSource(
[in] REFGUID guidSourceID);

HRESULT IsDynamicSourceRegistered(
[in] REFGUID gu""

};



APIs for Registering Events and Sources (Automation Binding):

HRESULT RegisterSource(
[in] BSTR sh" "
[in] BSTR gt

HRESULT IsSourceRegistered(
[in]BST _
[out] VARIANT_BOOL *pbIsRegistered

);

HRESULT RegisterStockEvent(

[in] BSTR guidEventID

);

HRESULT RegisterCustomEvent(
[in] BSTR guidSourceID,
[in] BSTR guidEventID,
[in] BSTR strVisibleName,
[in] BSTR strDescription,
[in] long nEventType,
[in] BSTR guidCategory,
[in] BSTR strIconFile,

HRESULT RegisterEventCategory(
[in] BSTR gi
[in] BSTR gi
[in] BSTR gi
[in] BSTR
[in] BSTR
[in] BSTR
[in] long nl

HRESULT UnRegisterSource(

HRESULT RegisterDynamicSource(
[in] BSTR strVisibleName,
[in] BSTR guidSourceID,
[in] BSTR strDescription,
[in] BSTR guidClsid,
[in] long inproc);

HRESULT UnRegisterDynamicSource(
[in] BSTR guidSourceID);

HRESULT IsDynamicSourceRegistered(
[in] BSTR guidSourceID,
[out] VARIANT_BOOL *boolRegistered);

};

"RegisterSource" is called by code that is installing a new
event-generating entity on a machine.

"IsSourceRegistered" detects if an event-generating entity
is present.

"RegisterStockEvent" is called by an event-generating
entity to note its use of a system event.

"RegisterCustomEvent" is called by an event-generating
entity to note its definition of a custom event.

"RegisterEventCategory" is called by an event-generating
entity to note its definition of a custom event category.

"UnRegisterSource" is called by code that is uninstalling
an event-generating entity.

"RegisterDynamicSource" is called by code that is installing
a DEC (dynamic event-generating entity).

"UnRegisterDynamicSource" is called by code that is
uninstalling a DEC (dynamic event-generating entity).

"IsDynamicSourceRegistered" detects if an event-generating
entity is present.

preceding them.

The APIs for registering events and sources (C interface/
automation binding) can be used by an application to
register which events can be generated by a data source.
These APIs turn such registration on and off. They also
specify whether the registration is a pre-defined, standard
event or a custom event. They can also specify the event
category, and they can determine whether a source is
registered or not.



FIG. 13 illustrates a screen print of an animated application
model which the present invention generates to show
the structure and activity of an application whose
performance is being studied. An important innovation in the
VSA's analysis function is its ability to dynamically generate
diagrams of the functionally active structure of the
application.

The VSA creates the application diagrams by closely
examining the event data that is received. As explained
above, events are correlated by the VSA to understand the
flow of control. The data design described above makes it
possible to understand which events need to be correlated
and how they should be grouped and connected.

Correlation makes use of the source and target informa- 
tion specified in the event data. When insufficient informa- 
tion is present, additional heuristics can be used to extrapo- 
late the event flow. This includes time-ordering, COM 
causality information, and event handles. 
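The correlation step can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the event layout (a dictionary with `time`, `source`, `target`, and `causality_id` keys) and the function name `correlate` are hypothetical, and the single fallback rule shown (time-ordering within a shared causality identifier) stands in for the richer set of heuristics the text mentions.

```python
def correlate(events):
    """Return (source, target) links for a list of event dictionaries.

    Hypothetical event shape: {"time", "source", "target", "causality_id"}.
    "target" is None when the emitter did not supply it.
    """
    ordered = sorted(events, key=lambda e: e["time"])
    links = []
    for i, ev in enumerate(ordered):
        target = ev["target"]
        if target is None:
            # Fallback heuristic: the next event in time order that shares
            # this causality ID and comes from another entity is taken to
            # be the target of this event.
            for later in ordered[i + 1:]:
                if (later["causality_id"] == ev["causality_id"]
                        and later["source"] != ev["source"]):
                    target = later["source"]
                    break
        if target is not None:
            links.append((ev["source"], target))
    return links


sample_events = [
    {"time": 2, "source": "B", "target": None, "causality_id": 7},
    {"time": 1, "source": "A", "target": None, "causality_id": 7},
    {"time": 3, "source": "C", "target": "D", "causality_id": 9},
]
sample_links = correlate(sample_events)
```

In this sketch an event that already names its target is linked directly, and an event that cannot be paired is simply left unlinked until more data is available.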

With reference to the screen print 370 of FIG. 13, the
functional interrelationship among blocks such as blocks
371 and 372 is visually depicted. (It will be understood by
one of ordinary skill in the art that, while all blocks in FIG.
13 are depicted with dummy labels, in practice each block
will bear an appropriate label in accordance with that
block's function or place within the performance model.) It
will also be understood by one of ordinary skill that many
other forms of visual portrayal of the application
performance model can be used.

As new diagram elements are identified, they are added to
the user's screen 370. Frequently, sufficient information is
not available to immediately connect them to other entities
on the diagram. This is the case with blocks 381 and 382 in
FIG. 13. As data becomes available, the entities are
connected.

This application model diagram is highly interactive.
Selections made in other VSA windows can result in
selections in the diagram. Incoming events are directly animated
into the diagram. Diagram blocks can be expanded or
collapsed to show more or less detail.

To support this interactive behavior, the diagram data
structures use a network of linked mapping tree data
structures to efficiently understand the impact of new data, and to
determine the blocks required to be added or removed when
more data arrives.

Incomplete information is stored specially, and when 
other incomplete data arrives, there is an attempt to pair up 
the incomplete data using pre-defined heuristics and the data 
design described above. 

Because the internal storage of the diagram only stores
blocks and their connections, it is very space efficient. In
normal scenarios storage space grows slowly relative to
the number of events that have been viewed.
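The space behavior described here can be illustrated with a small Python sketch. The class name `Diagram` and its fields are hypothetical; the sketch only assumes, as the text states, that the diagram keeps blocks and their connections rather than the events themselves.

```python
class Diagram:
    """Stores only blocks and connections, not the events that produced them."""

    def __init__(self):
        self.blocks = set()
        self.connections = set()
        self.events_seen = 0

    def add_event(self, source, target=None):
        self.events_seen += 1
        self.blocks.add(source)
        if target is not None:
            self.blocks.add(target)
            # A repeated (source, target) pair adds nothing new to the set.
            self.connections.add((source, target))


diagram = Diagram()
for _ in range(1000):
    diagram.add_event("client", "server")
diagram.add_event("server", "database")
```

After 1001 events the sketch holds three blocks and two connections, which is the sense in which storage tracks the structure of the application rather than the event count.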

FIG. 14 illustrates various user interface features of an
animated application model in an exemplary embodiment of
the invention. The user interface features are shown
generally by reference number 400. In the UI depicted in FIG. 14,
diagrams are portrayed of the different blocks representing
varying levels of detail of a hierarchical model of the
application.

As shown in FIG. 14, four different types of diagrams are
available representing varying levels of detail: machines,
processes, data sources, entities, and instances. Users can
expand and collapse items on these diagrams to create the
exact level of detail required. As well, the recorded event
data can be depicted adjacent to the animated application
model or overlaid upon it. In addition, using VCR-like
commands, described below with reference to FIG. 14, users
can play and replay the application execution, stop, pause,
reverse, speed up, slow down, and so forth.

Merely by way of illustration, an animated application
model, shown generally by reference number 410, includes
a machine 404, which is shown coupled functionally to a
machine 412, which in turn is coupled to a machine 411.
Each machine 404, 411, 412 can, in turn, be coupled to other
items (not shown).

A visual depiction of a first machine 404 can be
"exploded" into its constituent processes, depicted by box
402. The user can further "drill" into a process, such as
Process #1, to explode its constituent entities, depicted by
box 406. Further, the user can drill into an entity, such as
Entity #1, for example, to explode a view, depicted by box
408, showing the various Instances #1 through #N which are
included in Entity #1.

The drill-in shown in FIG. 14 can be mixed in the same
user screen. That is, a drill-in for machine 411 could show
only its constituent processes, and a drill-in for machine 412
could show only its constituent processes plus the entities
for one of the processes. So any individual box can be drilled
down or up independently. In addition, the user can perform
zooming, printing, and any other known screen operations.
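Independent drill-down of individual boxes can be sketched as a tree in which each node carries its own expanded flag. The Python below is a minimal illustration; the names `Node` and `visible` are hypothetical, and the labels simply echo the machine, process, entity, and instance levels described above.

```python
class Node:
    """One box in the hierarchical model (machine, process, entity, ...)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.expanded = False  # collapsed until the user drills in


def visible(node, depth=0):
    """Yield (depth, label) for every box currently shown."""
    yield depth, node.label
    if node.expanded:
        for child in node.children:
            yield from visible(child, depth + 1)


entity = Node("Entity #1", [Node("Instance #1"), Node("Instance #2")])
process = Node("Process #1", [entity])
machine = Node("Machine #1", [process, Node("Process #2")])

machine.expanded = True           # explode the machine into its processes
first_view = list(visible(machine))

process.expanded = True           # drill into one process only
second_view = list(visible(machine))
```

Because the flag lives on each node, expanding Process #1 leaves Process #2 and Entity #1 collapsed, which matches the mixed levels of detail the text describes.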

The graphical UI includes a display and a user interface
selection device, such as a keyboard or mouse. A model of
the functionally active structure of the data processing
system is displayed. Using the user interface selection
device, a selection signal is generated with respect to a
portion of the animated model, along with the user's
expansion or contraction command. The VSA performs an
expansion or contraction function on the selected portion in
response to the selection signal and to the expansion or
contraction command, and the selected portion is either
exploded or contracted per the expansion or contraction
command.

Behind this visual depiction of the application model, the
VSA maintains a log of all of the events that have been
collected.



The VSA utilizes a graphical UI paradigm in the form of 
a video cassette recorder (VCR) having, for example, 
Reverse, Stop, Pause, Speed, and Play commands. Other 
appropriate commands can be provided as indicated by an 
unlabeled button on the control panel. Using the VCR 
paradigm to control the depiction of the application 
performance, the VSA can run through each of the events 
and correspondingly animate the application model shown 
in FIG. 13 or FIG. 14. For example, if the current event is 
between Machine #1 and Machine #N, then a connection 
segment 411 is highlighted. Using the VCR commands, the 
user can change the speed, pause the display, and go 
backward and forward. 
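A playback control of this kind can be sketched as a cursor moving over the event log. The Python below is an illustrative assumption, not the VSA's implementation: the class name `Playback` and its methods are hypothetical, and speed control is left out because it would only change how often `step` is called by a timer.

```python
class Playback:
    """VCR-style cursor over a list of logged events."""

    def __init__(self, events):
        self.events = events
        self.position = 0     # index of the event currently highlighted
        self.direction = 1    # +1 for Play, -1 for Reverse
        self.paused = False

    def play(self):
        self.direction, self.paused = 1, False

    def reverse(self):
        self.direction, self.paused = -1, False

    def pause(self):
        self.paused = True

    def stop(self):
        self.paused = True
        self.position = 0

    def step(self):
        """Move one event in the current direction and return it.

        Returns None when paused or when either end of the log is reached.
        """
        if self.paused:
            return None
        nxt = self.position + self.direction
        if not 0 <= nxt < len(self.events):
            return None
        self.position = nxt
        return self.events[nxt]
```

Each event returned by `step` would then drive the highlighting of the corresponding connection in the diagram.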

While the user is doing this, a separate, adjacent window 
430 shows the event details. So while the event is occurring, 
and the application model diagram of FIG. 14 is being 
animated, the user can also view other pertinent performance 
details in window 430. 

Also shown in FIG. 14 is an adjacent time line window 
440 having equally spaced vertical lines throughout the time 
duration of an event. A special marker 445 moves from left 
to right through the vertical lines to show the progress of an 
event, either as the event occurs, or as the event is being 
played back by the user. 

All of the windows are time-synchronized to one another. 

Performance Analysis 

FIG. 15 illustrates a representative display of performance 
data in an exemplary embodiment of the invention. 

The VSA provides another important component for 
automatic analysis of collected data, the performance analy- 
sis component. The performance analysis component ana- 
lyzes the collected data and creates a call tree by pairing 
events (e.g. Call and Return) and ordering them using 
temporal ordering and heuristics. The result is a presentation 
of the call tree in a Gantt style view with any Perfmon (or 
other dynamic) data displayed adjacent to or overlying the 
displayed call tree. With this view, the VSA provides a 
mechanism to simultaneously view application and
environmental performance information and quickly drill into the
details (by expanding to another level in the call tree). When 
the VSA is used to track and graph load information, the 
VSA provides an innovative way for the user to view how 
applications perform, behave, and degrade under different 
load and stress scenarios. 

Like the animated application model, the call tree is 
generated by the application of suitable pre-determined 
heuristics, since the user does not have any a priori knowl- 
edge of the call relationships of more than two objects. 
Temporal and contextual information, for example, are used 
to deduce a call tree without full information. It will be 
apparent to one of ordinary skill that other kinds of infor- 
mation can also be used to deduce a call tree. 
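The pairing of Call and Return events into a call tree can be sketched with a stack ordered by time. This Python example is a simplified illustration: it assumes a single thread of control with properly nested calls, and the tuple layout `(time, kind, name)` and the function name `build_call_tree` are hypothetical.

```python
def build_call_tree(events):
    """Build a nested call tree from (time, kind, name) tuples.

    kind is "Call" or "Return". Assumes one thread of control and
    properly nested calls; unmatched Return events are ignored.
    """
    root = {"name": "<root>", "start": None, "end": None, "children": []}
    stack = [root]
    for time, kind, name in sorted(events):
        if kind == "Call":
            node = {"name": name, "start": time, "end": None, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif kind == "Return" and len(stack) > 1 and stack[-1]["name"] == name:
            # Pair this Return with the most recent unmatched Call.
            stack.pop()["end"] = time
    return root


tree = build_call_tree([
    (0, "Call", "A"),
    (1, "Call", "B"),
    (3, "Return", "B"),
    (5, "Return", "A"),
])
```

The `start` and `end` values of each node give the position and length of the corresponding bar in a Gantt-style view.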

With reference to FIG. 15, an upper window 450 includes 
a process summary portion 460 and a performance summary 
portion 470. The process summary portion 460 comprises a 
Call Hierarchy including Call, Enter, Leave, and Return 
events. Each of these events can contain sublevels, as shown 
for the Call event. It will be understood that the sublevels 
can be further subdivided to whatever degree is required, as 
shown for the Leave event. The user can expand or collapse 
the levels of detail for each of the events, as desired. 

Each of the Call, Enter, Leave, and Return events can
have a corresponding Gantt type of representation, as
illustrated in performance summary portion 470, showing the
duration of the event. For example, Gantt segment 471
represents the duration of the Call event. The durations of
the Enter, Leave, and Return events are shown by Gantt
segments 472, 473, and 474, respectively.

Performance summary portion 470 thus provides a
Gantt-style presentation of the call tree, i.e. who calls
whom. The Gantt bars 471-474 show when the Call started
and how long it lasted. This information comes from the
IEC.

Beneath the call tree performance summary, a graph 480
can be depicted to show, for example, the CPU utilization
during the Call operation such as an RPC. Graph 480, which
may be positioned adjacent to or overlaying the Gantt
segments 471-474, could also illustrate any one or more
other desired aspects of the system performance besides the
CPU utilization. The Gantt chart can be based upon the
application events. The graph can be selected from the time
base.

Also shown in FIG. 15 is a summary window 490 which
provides a distillation of what is shown in the performance
windows 410 and 430 of FIG. 14 and in the upper window
450 of FIG. 15. For example, if the time slice between
dashed lines 481 and 482 is selected for scrutiny, a summary
performance graph 492 is generated for the selected time
segment. Summary window 490 also contains a textual
description of the application's performance during the
specified time segment.

Thus the user can view a tightly synchronized, easily
comprehensible graphical and textual analysis and
representation of the application performance, in the form of the
animated block diagram 410, the Event Detail window 430,
and the Time Line window 440 of FIG. 14, as well as the
process summary portion 460 and the performance summary
portion 470 of FIG. 15. The summary window 490 ties
everything together. Again, everything is time-synchronized.

In addition, all of the above windows can be operated to
display the application performance in real time as well as
"post mortem". This applies as well to the animated
application models, as shown in the screen print of FIG. 13 and
in window 410 of FIG. 14, so that in real time as an
application is being analyzed, one block will appear, then
another, and then the interconnection between the two
blocks. Blocks are dynamically added, removed, and moved,
and the interconnections between them are dynamically
changed to reflect changing conditions in the execution of
the application. The diagram is kept up to date with what is
really happening.

FIG. 16 illustrates a screen print 500 of an exemplary
display of performance data. Screen print 500 depicts the
percentage of CPU utilization for a selected group of
processors. Window 504 shows a graph line 505 which, for
example, depicts the percentage of CPU utilization
(right-hand side) versus time (bottom side). In general, graph lines
represent overlaid DEC data.

Window 502 depicts a list of events relating to the
operation of the processors under scrutiny.

Window 506 depicts a legend or key to the information
shown in window 504. Window 506 indicates the source
machines (all) as well as summary performance information
(a minimum of 13 processors, a maximum of 100
processors, and an average of 49 processors executing
simultaneously; currently 35 processors concurrently
executing). Window 506 also comprises a "legend" 507
which provides a color key 508 to assist the user in
identifying graph lines in window 504, such as Gantt bars 510,
511, and 512, or graph line 505. While window 504 only
shows one graph line 505, more can be shown. Window 506
provides an indication of the source machines, maximum,
minimum, average, and current value for each graph line
shown in window 504.

Additional Tools

The VSA provides a few other tools which, when used in
conjunction with the features described above, provide
additional insight into application performance.

FIG. 17 illustrates a screen print 520 of a timeline display
of performance data. The timeline window presents a visual
representation of the timing of all related events. Dark
clumps 522 represent tight groupings of events, while spaces
524 represent possible underutilization of resources.
Timeline 520 can be annotated to present event activity per
machine or per process (or other system resource) using
different colors. This allows users to visually identify both
potential system-wide and per-machine bottlenecks. As
playback or monitoring continues, the timeline 520 acts as
a real-time indicator of the current system context.

FIG. 18 illustrates a screen print 530 of a summary display
of performance data. Similar to previously described
summary window 490 in FIG. 15, but depicting different
information, the summary information in screen print 530
presents a distillation of all events selected by the VSA user.
That is, if multiple events are selected, the unique elements
(e.g. source and target machines, processes, entities, etc.) are
displayed. This is very useful when a time range is selected
either in the timeline or performance viewer. The summary
window allows the user to see a quick tally of what is going
on in the application. This is a particularly important view
because of the large volumes of data generated while
monitoring a system.

Synchronization

FIG. 19 illustrates a screen print 550 of several
synchronized sets of performance data. Screen 550 comprises
several windows, including an animated application model or
process diagram 552, an event log window 554, CPU
performance view window 556, event viewing window 558,
a summary window 560, and a time line window 562.

The VSA ensures that all information presented to the user
is cross-correlated. This provides instant synchronization.
When the user selects an item (or set of items) in one
window, all other windows can (based on user preference)
automatically highlight the selection. This includes the
selection of specific events, selection of all events in a
specified time range, or selection of all events associated
with a specified entity. However, if the user desires,
auto-synchronization can be turned off for any one or more
windows.

FIG. 19 illustrates this concept. Here, for example, the
user made a time selection in the performance view window
556 (representing PerfMon data) over a period of time where
CPU behavior was in question. The animated application
model or process diagram 552 highlights the entities/
processes involved in the selection. The event log window
554 highlights all events in the specified time range, part of
which represent a call tree. The event viewing window 558
presents data on a single event (for multi-event selections it
highlights the first event). The timeline window 562
highlights the specified time range as well as shows performance
peaks, and the summary window 560 tallies the events in the
time range and presents a summary.

Thus, while displaying the animated functional model
552, the control station can also simultaneously display
items such as summary data 560, time data 562, event details
558, and/or an event log or call tree 554.

Window synchronization avoids a common problem with 
systems based on multiple windows. In a typical multi- 
window system, the user wants to have one or two windows 
fully visible, while others are invisible. Typically no context 
flows to or from invisible elements, despite the fact that the 
user may want this to happen. The VSA avoids this problem 
by creating a user notion of a shared selection (the 
'AutoSelection'), and allows the user to subscribe windows 
to that selection. As a result, the user is not confused by the 
flow of context, and instead they find it predictable and 
natural. 
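The shared-selection idea can be sketched as a small publish/subscribe object. The Python below is illustrative only; the class names `AutoSelection` and `Window` and their methods are hypothetical, and the sketch assumes that a subscribed window receives the selection whether or not it is currently visible.

```python
class Window:
    """Minimal stand-in for one VSA window."""

    def __init__(self, name, visible=True):
        self.name = name
        self.visible = visible
        self.highlighted = frozenset()

    def highlight(self, items):
        self.highlighted = frozenset(items)


class AutoSelection:
    """Shared selection that subscribed windows follow."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, window):
        self._subscribers.append(window)

    def unsubscribe(self, window):
        self._subscribers.remove(window)

    def select(self, items, origin=None):
        # Context flows to every subscriber, visible or not, except the
        # window in which the user made the selection.
        for window in self._subscribers:
            if window is not origin:
                window.highlight(items)


shared = AutoSelection()
timeline = Window("timeline")
event_log = Window("event log", visible=False)
perf_view = Window("performance view")
shared.subscribe(timeline)
shared.subscribe(event_log)
shared.select({"ev1", "ev2"}, origin=perf_view)
```

Turning auto-synchronization off for a window corresponds to calling `unsubscribe` for it in this sketch.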

The system level overview of the operation of an exem- 
plary embodiment of the invention has been described in the 
Detailed Description. As described, the method and
apparatus for analyzing the performance of a data processing
system and, in particular, of an application running on a
distributed data processing system, enable users to quickly
and easily observe the operational performance of such a
system without significantly impacting such performance.

Methods of Exemplary Embodiments of the 
Invention 

The previous sections have described the structure and 
operation of various exemplary embodiments of the inven- 
tion. In this section, the particular methods performed by 
such exemplary embodiments are described by reference to 
a series of flowcharts. These methods constitute computer 
programs made up of computer-executable instructions. 
Describing the methods by reference to flowcharts enables 
one skilled in the art to develop such programs including 
such instructions to carry out the methods on suitable 
computing systems (the processor of the computing systems 
executing the instructions from computer-readable media). 

FIGS. 20-27 are flowcharts of methods to be performed
according to exemplary embodiments of the invention. It
will be understood by one of ordinary skill that the steps 
depicted in these flowcharts need not necessarily be per- 
formed in the order shown. It will also be understood that 
while the flowcharts have "Start" and "End" blocks, in 
general the processes they depict are continuously per- 
formed. 

FIG. 20 A-C is a flowchart illustrating, in steps 601 
through 612, overall data collection architecture and how 
data is collected via the IECs, DECs, and LECs. The process 
begins with block 601. In block 602 the operating system or 
middleware creates an IEC reference. In the next block 603, 
the control station 100 creates an LEC. 

Block 604 depicts that the LEC converts the IEC refer- 
ence to an IEC. In block 605 the LEC is indicated as being 
capable, for example, of turning the IEC on or off by 
enabling or disabling its IsActive status condition. 

In block 606 the control station 100 can turn a DEC on or 
off. 

In block 607 an IEC collects events generated by a data
source within the data processing system under scrutiny. The
term "collect" herein broadly includes the IEC's function of
creating events in response to certain conditions occurring
within the process space it is monitoring.

In block 608 the LEC collects events from the IEC and 
sends them to the control station 100. 

In block 609 the DEC collects events that are generated on
a time basis. The term "collect" herein broadly includes the
DEC's function of creating events in response to monitoring
certain time-valued system functions.

In block 610 the LEC collects data from the DEC and
sends it to the control station 100. Block 611 indicates that
the LEC buffers a predetermined quantity of data and only
stores the data on request of the control station 100. The
process ends in block 612.

FIG. 21 A-B is a flowchart illustrating, in steps 615
through 625, an exemplary embodiment of overall data
design and how the VSA determines and maps relationships
between entities. The process starts with block 615. Next in
block 616 events are identified by one or more pre-defined
event fields and/or custom event fields. In block 617 events
that are generated as a result of interactions among entities
in the data processing system under scrutiny are collected. In
block 618 an IEC monitors events and sends them to an
LEC. In block 619 a DEC monitors time-based events and
sends them to an LEC. In block 620 an LEC collects events
and sends them to the control station. Next in block 621 the
VSA analyzes the events and their event fields, and in block
622 the VSA determines the relationships among the
entities, as described earlier. In block 623 the VSA maps the
relationship among the entities, based in part on the content
of the event fields. In block 624 the VSA generates a
functional block diagram of the relationship among entities,
and the process ends in block 625.

FIG. 22 A-B is a flowchart illustrating, in steps 630
through 639, an exemplary embodiment of triggers. The
method starts in block 630. In block 631 a control station
specifies one or more trigger conditions, and it can specify,
if desired, a Boolean relationship between two or more
trigger conditions. The control station can also specify
filters, for example a first filter and a second filter. The
second filter can be more detailed and comprehensive than
the first filter. The control station can also specify a reset
condition. It can also specify how many events the LEC
should store in its circular buffer store.

In block 632 an LEC collects events in accordance with
the first filter while watching for a trigger condition, and in
block 633 the LEC's buffer store stores events collected by
the LEC. In block 634, when the LEC detects a trigger
condition, it sends the stored events to the control station,
and in block 635 the LEC begins collecting events in
accordance with the second filter and sending them to the
control station. In block 636 the LEC watches for a reset
condition. In block 637, if the LEC detects a reset condition,
it stops sending events to the control station, and in block
638 the LEC reverts to collecting events in accordance with
the first filter and watching for another trigger condition. The
process ends in block 639.
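The trigger behavior of blocks 631 through 638 can be sketched as a two-state collector with a bounded buffer. The Python below is an illustration under stated assumptions: the class name `TriggeredCollector` is hypothetical, filters and conditions are modeled as plain predicate functions, the `sent` list stands in for the link to the control station, and the event that satisfies the reset condition is not itself forwarded.

```python
from collections import deque


class TriggeredCollector:
    """Sketch of an LEC that buffers events until a trigger fires."""

    def __init__(self, first_filter, second_filter, trigger, reset, capacity):
        self.first_filter = first_filter
        self.second_filter = second_filter
        self.trigger = trigger
        self.reset = reset
        self.buffer = deque(maxlen=capacity)  # circular buffer store
        self.triggered = False
        self.sent = []  # stands in for the link to the control station

    def on_event(self, event):
        if not self.triggered:
            if self.first_filter(event):
                self.buffer.append(event)      # oldest entry drops when full
            if self.trigger(event):
                self.sent.extend(self.buffer)  # send the stored history
                self.buffer.clear()
                self.triggered = True
        elif self.reset(event):
            self.triggered = False             # revert to the first filter
        elif self.second_filter(event):
            self.sent.append(event)


lec = TriggeredCollector(
    first_filter=lambda e: e % 2 == 0,
    second_filter=lambda e: True,
    trigger=lambda e: e == 6,
    reset=lambda e: e == 9,
    capacity=2,
)
for number in range(1, 12):
    lec.on_event(number)
```

Using a `deque` with `maxlen` gives the circular behavior directly: once the buffer holds the configured number of events, each new event displaces the oldest one.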

FIG. 23 A-B is a flowchart illustrating, in steps 645
through 653, an exemplary embodiment of filter reduction.
The process begins in block 645. In blocks 646-648, a user
specifies a filter, which process can take the form of a series
of iterations of blocks 646-648. In block 646 a menu or
graphical user interface is displayed which lists one or more
items representing machines, components, IECs, DECs,
processes, events, and threads within the data processing
system under examination. The user can choose a filter in the
form of a Boolean expression comprising two or more items.
In block 647, the user selects his or her choice by generating
a suitable menu entry selection signal using, for example, a
mouse or keyboard. Block 648 indicates that step 647 is
repeated, as necessary, until all desired filter items have been
selected by the user.

Next in block 649 the filter is either sent to one or more
specific machines, processes, IECs, DECs, events, or
threads, or it is broadcast generally throughout the data
processing system. In block 650 the filter is applied to one
or more specific machines, processes, IECs, DECs, events,
and/or threads, in accordance with its user-selected
variables. In block 651 an IEC and a DEC collect events in
accordance with the filter. In block 652 the LEC collects
events from the IEC and the DEC in accordance with the
filter, and the LEC sends the collected events to a control
station. The process ends in block 653.
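A filter expressed as a Boolean combination of items can be sketched with small predicate combinators. The Python below is a hedged illustration: the helper names `item`, `all_of`, `any_of`, and `negate`, the event field names, and the sample values are all hypothetical and are not taken from the VSA.

```python
def item(field, value):
    """Match events whose given field equals the given value."""
    return lambda event: event.get(field) == value


def all_of(*predicates):
    return lambda event: all(p(event) for p in predicates)


def any_of(*predicates):
    return lambda event: any(p(event) for p in predicates)


def negate(predicate):
    return lambda event: not predicate(event)


# machine == web01 AND (process == iis OR process == sql) AND NOT Heartbeat
user_filter = all_of(
    item("machine", "web01"),
    any_of(item("process", "iis"), item("process", "sql")),
    negate(item("event", "Heartbeat")),
)

filter_events = [
    {"machine": "web01", "process": "iis", "event": "Call"},
    {"machine": "web01", "process": "iis", "event": "Heartbeat"},
    {"machine": "web01", "process": "notepad", "event": "Call"},
    {"machine": "db01", "process": "sql", "event": "Call"},
]
collected = [e for e in filter_events if user_filter(e)]
```

Applying the filter at the collectors rather than at the control station means that only the first of the four sample events would be sent onward.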

FIG. 24 A-B is a flowchart illustrating, in steps 660
through 668, an exemplary embodiment of filter
combination. The process begins in block 660. In block 661, one or
more control stations specify more than one filter. Each filter
designates one or more machines, processes, IECs, DECs,
events, and/or threads. In block 662 the filters are sent to one
or more LECs, each of which combines the filters it receives
into a respective combined filter. Each combined filter
applies to specific machines, processes, IECs, DECs, events,
and/or threads. In block 663 an IEC collects events
generated by a first data source within the data processing system
under examination. In block 664 a DEC collects events that
are generated on a time basis by a second data source within
the data processing system under examination. In block 665
the IEC and DEC each collect events in accordance with a
combined filter.

In block 666 the LEC collects events from the IEC and
from the DEC in accordance with a combined filter, and the
LEC sends the events to the control station or control
stations which specified that the events be monitored. In
block 667 the control station analyzes the events. The
process ends in block 668.
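Filter combination at an LEC can be sketched as follows. The Python is illustrative only: the class name `Collector` is a hypothetical stand-in for an LEC, filters are modeled as predicate functions, and combining them with a logical OR is an assumption, since the text does not state how the combined filter is formed.

```python
class Collector:
    """Hypothetical stand-in for an LEC serving several control stations."""

    def __init__(self):
        self.filters = {}  # control station name -> predicate

    def set_filter(self, station, predicate):
        self.filters[station] = predicate

    def combined(self, event):
        # Assumed combination rule: collect an event if any station wants it.
        return any(pred(event) for pred in self.filters.values())

    def route(self, event):
        """Return the control stations that asked for this event."""
        return sorted(s for s, pred in self.filters.items() if pred(event))


collector = Collector()
collector.set_filter("station A", lambda e: e["machine"] == "web01")
collector.set_filter("station B", lambda e: e["event"] == "Call")

web_return = {"machine": "web01", "event": "Return"}
db_call = {"machine": "db01", "event": "Call"}
db_return = {"machine": "db01", "event": "Return"}
```

With a rule of this kind each event is collected once, and `route` then decides which of the control stations that specified a matching filter should receive it.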

FIG. 25 A-B is a flowchart illustrating, in steps 670 
through 680, an exemplary embodiment of a user interface 
for specifying one or more filters. The process begins in 
block 670. In block 671 a control station provides a graphi- 
cal user interface (UI) to a user for enabling the user to 
specify at least one filter. In block 672 a menu is displayed
listing items representing event-generating machines,
event-generating components, and/or categories of events within
the data processing system under examination.

In block 673 the VSA receives a menu entry selection
signal indicative of a user interface selection device
selecting one of the items to monitor. Block 674 indicates that step
673 is repeated, as necessary, until all desired items have
been selected.

Block 675 indicates an alternate step to step 672, in that
the UI displays a pre-defined list of filters from which a user
can specify at least one filter. The pre-defined list can be a
"top 10" of the most popular filters in use, and it can be
updated automatically by the VSA. Here the user has only to
click on one filter, and it automatically includes a set of the
items displayed in block 672.

In block 676 a textual representation of the user-selected
filter is displayed in a window. In addition, a window is
provided in which the user can enter the filter directly in text
format. In block 677 an IEC and a DEC each collect events
in accordance with the user-selected filter. In block 678 an
LEC collects events from the IEC and from the DEC, in
accordance with the filter, and the LEC sends the events to
the control station. In block 679 the control station either
analyzes events collected by the LEC as the events are
collected, or the LEC analyzes the events after the events
have been collected (in post mortem fashion). The process
ends in block 680.

FIG. 26 A-C is a flowchart illustrating, in steps 690
through 700, an exemplary embodiment of automatic
generation of an animated application model. The process
begins in block 690. In block 691 an IEC collects events
generated by a first data source within a data processing
system under examination. In block 692 a DEC collects
events that are generated on a time basis by a second data
source within the data processing system under examination.

In block 693 an LEC collects events from the IEC and 
from the DEC and sends them to the control station. In block 
694 the control station analyzes the events and displays a 
model of the functionally active structure of the data pro- 
cessing system under examination. While displaying the 
animated functional model, the control station can also 
simultaneously display items such as summary data, time 
data, event details, and/or a call tree. In block 695 the control 
station keeps updating the animated model in real time as it 
receives and analyzes events. 

In block 696 the control station presents a user interface 
(UI) to the user in the form of a display and a user interface
selection device, and uses a video cassette recorder (VCR)
paradigm to enable the user to analyze the performance of
the data processing system. The UI displays user-selectable
commands, such as Play, Replay, Stop, Reverse, Pause, and
Change Speed of the animated model. In block 697 the UI 
also enables the user to select one or more portions of the 
model and to either explode or enlarge a selected portion of 
the model to show more detail, or to contract or shrink a 
selected portion of the model to show less detail. 

In block 699 the control station displays the active por- 
tions of the animated model in a visually distinctive manner, 
for example by highlighting them. The process ends with 
block 700. 
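
The VCR paradigm of block 696 can be sketched as a cursor that moves over a recorded event list. This is an illustrative sketch only: the class name ModelPlayback, the notion of a display "tick", and the interpretation of Change Speed as "events applied per tick" are assumptions, not details taken from the description.

```python
class ModelPlayback:
    """Cursor over recorded events, driven by VCR-style commands (hypothetical)."""

    def __init__(self, events):
        self.events = list(events)
        self.position = 0      # index of the next event when playing forward
        self.step_size = 1     # events applied per display tick (assumed meaning of speed)
        self.direction = 1     # +1 forward, -1 reverse
        self.paused = True

    def play(self):
        self.direction = 1
        self.paused = False

    def reverse(self):
        self.direction = -1
        self.paused = False

    def pause(self):
        self.paused = True

    def stop(self):
        # Stop halts playback and rewinds to the start of the recording.
        self.paused = True
        self.position = 0

    def change_speed(self, events_per_tick):
        if events_per_tick < 1:
            raise ValueError("speed must be at least one event per tick")
        self.step_size = events_per_tick

    def tick(self):
        """Advance one display tick and return the events to show, in display order."""
        if self.paused:
            return []
        if self.direction > 0:
            end = min(self.position + self.step_size, len(self.events))
            shown = self.events[self.position:end]
            self.position = end
        else:
            start = max(self.position - self.step_size, 0)
            shown = self.events[start:self.position][::-1]
            self.position = start
        return shown
```

Replay can be expressed in this sketch as stop() followed by play(); the animated model would be redrawn from the events returned by each tick().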

FIG. 27 A-C is a flowchart illustrating, in steps 710 
through 720, an exemplary embodiment of a user interface 
for displaying the performance analysis of the system under 
examination. The process begins in step 710. In block 711 
the control station analyzes events, for example events 
received from an LEC. In block 712 the control station 
displays a call tree of the functionally active structure of the 
data processing system under examination. In block 713 the 
control station can, while continuing to display the call tree, 
display time-synchronized items such as Gantt type charts, 
process summary data, performance summary data, and/or 
time data. In block 714 the control station updates the call 
tree in real time while it continues to receive events and 
analyze them. 

In block 715 the user interface enables the user to select 
one or more portions of the call tree to analyze more closely. 
In blocks 716 and 717, the UI enables the user to either 
explode or enlarge a selected portion of the model to show 
more detail, or to contract or shrink a selected portion of the 
model to show less detail. In block 718 the control station 
uses heuristics such as time-ordering, causality information, 
and event handles to generate and display the call tree. In 
block 719 the control station displays active portions of the 
animated model in a visually distinctive manner, for 
example by highlighting them, displaying them in a different 
color, or "flashing" them. The process ends in block 720. 
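
The heuristics named in block 718 (time-ordering and causality information) can be sketched as follows. The sketch assumes, for illustration only, that each event carries a timestamp, a causality i.d., a kind of either "call" or "return", and a target name; these field names, the Node class, and the functions build_call_trees and render are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One call in the call tree."""
    name: str
    children: List["Node"] = field(default_factory=list)


def build_call_trees(events) -> Dict[int, List[Node]]:
    """Group events by causality i.d. and nest calls using time order.

    Each event is a dict with the assumed keys
    'time', 'causality_id', 'kind' ('call' or 'return'), and 'target'.
    """
    roots: Dict[int, List[Node]] = {}
    stacks = defaultdict(list)  # one open-call stack per causality chain
    for ev in sorted(events, key=lambda e: e["time"]):
        stack = stacks[ev["causality_id"]]
        if ev["kind"] == "call":
            node = Node(ev["target"])
            if stack:
                stack[-1].children.append(node)   # nested inside the open call
            else:
                roots.setdefault(ev["causality_id"], []).append(node)
            stack.append(node)
        elif ev["kind"] == "return" and stack:
            stack.pop()                            # innermost open call finished
    return roots


def render(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented text lines, one line per call."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines
```

Sorting by time before nesting means events may arrive from the LEC in any order; events with different causality i.d.s never share a stack, so interleaved activity from unrelated requests stays in separate trees.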

The particular methods performed by the significant 
exemplary embodiments of the invention have now been 
described with reference to the flowcharts of FIGS. 19-26. 

Conclusion 

A method and apparatus for analyzing the performance of 
a data processing system have been described which over- 
come many of the disadvantages of prior known systems. 
The VSA collects application performance data by use of 
instrumentation within the application t 
using an efficient, distributed collection architecture. By
instrumenting the core application platform, the VSA can 
obtain information about the application without having to 
make changes to it. 

The VSA enables the user to view an animated model of 
the application as it is running, as a set of interconnected 
black boxes. It does so without re-architecting or recompil- 
ing the original code. 

The VSA includes an efficient mechanism for collecting
and transmitting the data to a central log, and for streaming 
it to disk. A user interface is provided for detailed and 
specific selection of what to analyze, and the system is 
automatically configured to minimize impact based on the
selection criteria. This information is distributed across the 
monitored systems and is used to efficiently collect analysis 
data. 

In addition, the user is provided with automatic analysis 
tools to filter and view the operation of the application and
to locate performance issues. A user display provides over-
lay and time-synchronized system performance data in any
of a wide variety of user-specified formats. The VSA can be
used for both live and post-mortem analysis.

As a consequence, this invention provides software 
developers, including developers of distributed component- 
based systems, with the ability to understand and analyze the 
behavior of their software while it is executing. The VSA 
can help find performance bottlenecks, understand system
structure, and isolate behavioral problems. 

Although specific embodiments have been illustrated and 
described herein, it will be appreciated by those of ordinary 
skill in the art that any arrangement which is calculated to
achieve the same purpose may be substituted for the specific 
embodiments shown. This application is intended to cover 
any adaptations or variations of the present invention. 

It will be apparent to those of ordinary skill that the
collection aspects of the invention can be implemented 
either in the operating system or in middleware. 
Furthermore, the invention can be implemented in any
desirable manner, e.g. by splitting it into separate pieces
such as filter-specifying, event-firing, data collection, and
analysis/presentation. For example, by including one or 
more pieces in the operating system, the potential utilization 
of the invention can be widespread. 

For example, those of ordinary skill within the art will 
appreciate that in one embodiment a virtual-machine style
system (e.g. a Java system) could automatically insert the 
implementing features of this invention into all programs at 
the virtual machine level. 

Alternatively, a hardware-based system could automati-
cally generate out-of-band signals at the hardware level in
accordance with the concepts disclosed herein. 

In addition, a data-bound system (e.g. an Oracle database) 
could use data triggers to get similar results. 

Finally, as future operating systems are developed, the
innovations herein could be applied to an agent-based oper- 
ating system that is able to automatically migrate to different 
machines. 

Therefore, it is manifestly intended that this invention be
limited only by the following claims and equivalents
thereof.

APPENDIX I

ADDITIONAL DESCRIPTION 

The following material is not to be construed as 
pending claims 

Data Design 

1. A system for analyzing and mapping relationships among 
entities in a data processing system in which events are 
generated as a result of interactions among the entities 
comprising: 

an event concentrator that collects the events; and 
a control station coupled to the event concentrator and 
receiving events therefrom, the control station analyz- 
ing the events and mapping relationships among the 
entities. 

2. A system as recited in claim 1 and further comprising: 
an in-process event creator that monitors events and sends
them to the event concentrator.

3. A system as recited in claim 1 and further comprising: 

a dynamic event creator that monitors time-based events 
and sends them to the event concentrator. 

4. A system as recited in claim 1 in which an event is 
identified by one or more event fields, and in which the 
control station analyzes events and maps relationships 
among entities based in part on the content of the event 
fields. 

5. A system as recited in claim 3 in which an event field is 
from a group comprising arguments, unique i.d., causality 
i.d., correlation i.d., dynamic event data, exception, return 
value, security i.d., source component, source handle, 
source machine, source process, source process name, 
source session, source thread, target component, target 
handle, target machine, target process, target process 
name, target session, and target thread.

6. A system as recited in claim 5 in which an event field 
comprises one or more default event fields. 

7. A system as recited in claim 4 in which an event field is 
a custom field. 

8. A system as recited in claim 4 in which the events 
comprise a call event and an enter event between two 
entities, one being a source entity that performs the call 
event and the other being a target entity that performs the 
enter event, and in which the event fields comprise a 
unique i.d., causality i.d., and correlation i.d. 

9. A system as recited in claim 8 in which the event fields 
further comprise source component, source machine, 
source process, and source session for the source entity. 

10. A system as recited in claim 8 in which the event fields 
further comprise target component, target machine, target 
process, and target session for the target entity. 

11. A system as recited in claim 1 in which the control station
maps the relationship among the entities in the form of a 
functional block diagram. 

12. A system as recited in claim 1 in which the data 
processing system is a distributed system comprising a 
plurality of machines, and entities reside on different 
machines. 

13. In a data processing system comprising a plurality of 
entities, a method comprising the steps of: 
collecting events that are generated as a result of inter- 
actions among the entities; 

analyzing the events; and 

determining the relationship among the entities. 

14. The method recited in claim 13 and further including the 
step of: 
mapping the relationship among the entities.

15. The method recited in claim 14 wherein the step of 
mapping the relationship among the entities comprises 
generating a functional block diagram of such relationship.

16. The method recited in claim 13, wherein the steps recited 
therein can be performed in any suitable order. 

17. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 13.

18. A method for analyzing and mapping relationships 
among entities in a data processing system in which 
events are generated as a result of interactions among the 
entities, and in which events are identified by one or more 
event fields, the method comprising the steps of:
collecting the events;

analyzing the event fields; and

mapping relationships among the entities based in part on 
the content of the event fields. 

19. The method recited in claim 18, wherein the steps recited
therein can be performed in any suitable order. 

20. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 18. 

21. In a data processing system comprising a plurality of
entities, an event concentrator, and a control station 
coupled to the event concentrator, the method comprising 
the steps of: 

the event concentrator collecting events that are generated 
as a result of interactions among the entities; and

the control station analyzing the collected events and 
determining the relationship among the entities. 

22. The method recited in claim 21 and further including the
step of: 

the control station mapping the relationship among the
entities. 

23. The method recited in claim 22 wherein the step of 
mapping the relationship among the entities comprises 
generating a functional block diagram of such relationship.

24. The method recited in claim 21 in which an event is 
identified by one or more event fields, and in which the 
control station analyzes events and maps relationships 
among entities based in part on the content of the event 
fields.

25. The method recited in claim 24 in which an event field 
is from a group comprising arguments, unique i.d., cau-
sality i.d., correlation i.d., dynamic event data, exception,
return value, security i.d., source component, source
handle, source machine, source process, source process
name, source session, source thread, target component, 
target handle, target machine, target process, target pro- 
cess name, target session, and target thread. 

26. The method recited in claim 24 in which an event field 
comprises one or more default event fields.

27. The method recited in claim 24 in which an event field 
is a custom field. 

28. The method recited in claim 24 in which the events 
comprise a call event and an enter event between two 
entities, one being a source entity that performs the call
event and the other being a target entity that performs the 
enter event, and in which the event fields comprise a 
unique i.d., causality i.d., and correlation i.d. 

29. The method as recited in claim 28 in which the event 
fields further comprise source component, source
machine, source process, and source session for the 
source entity. 
30. The method as recited in claim 28 in which the event
fields further comprise target component, target machine, 
target process, and target session for the target entity. 

31. The method recited in claim 21 wherein the data 
processing system further comprises an in-process event 
creator, the method further comprising the step of: 

the in-process event creator monitoring events and send- 
ing them to the event concentrator. 

32. The method recited in claim 21 wherein the data 
processing system further comprises a dynamic event 
creator, the method further comprising the step of: 

the dynamic event creator monitoring time-based events 
and sending them to the event concentrator. 

33. The method recited in claim 21 in which the data 
processing system is a distributed system comprising a 
plurality of machines, and entities reside on different 
machines. 

34. The method recited in claim 21, wherein the steps recited 
therein can be performed in any suitable order. 

35. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 21. 
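
The pairing of call and enter events described in the Data Design material above can be sketched as follows. The sketch assumes, for illustration only, that a call event and its matching enter event share a correlation i.d.; the Event class layout and the function map_relationships are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A reduced event record; only a few of the listed event fields are modeled."""
    unique_id: int
    causality_id: int
    correlation_id: int
    kind: str        # "call" (fired by the source entity) or "enter" (fired by the target)
    component: str   # component that fired the event
    machine: str     # machine on which the component resides


def map_relationships(events):
    """Return a Counter of (source component, target component) edges.

    A call event is matched with the enter event that has the same
    correlation i.d. (an assumption of this sketch). Unmatched events
    are ignored.
    """
    calls = {}
    enters = {}
    for ev in events:
        if ev.kind == "call":
            calls[ev.correlation_id] = ev
        elif ev.kind == "enter":
            enters[ev.correlation_id] = ev

    edges = Counter()
    for correlation_id, call in calls.items():
        enter = enters.get(correlation_id)
        if enter is not None:
            edges[(call.component, enter.component)] += 1
    return edges
```

Each edge in the result corresponds to one arrow in a functional block diagram, and the count gives a rough measure of how often the source entity invoked the target entity.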

Triggers 

1. A system for analyzing the performance of a data pro-
cessing system that produces events, the system comprising:

a control station that analyzes events, the control station 
specifying at least one trigger condition; 

an event concentrator, coupled to the control station, that 
collects events and watches for the at least one trigger 
condition; and 

a store, coupled to the event concentrator, that stores 
events until the occurrence of the at least one trigger 
condition, whereupon the stored events are sent to the 
control station. 

2. The system recited in claim 1, wherein the store is a 
circular buffer, and the control station specifies how many 
events to store in the buffer. 

3. The system recited in claim 1, wherein the control station 
specifies two trigger conditions and a Boolean relation 
between them. 

4. The system recited in claim 1, wherein when the at least 
one trigger condition occurs, the event concentrator 
begins collecting events at a higher rate. 

5. The system recited in claim 1, wherein the control station 
specifies a reset condition, and wherein the event concen- 
trator watches for a reset condition. 

6. The system recited in claim 5, wherein events are no 
longer sent to the control station when the event concen-
trator detects a reset condition.

7. The system recited in claim 6, wherein, upon occurrence 
of a reset condition, the event concentrator reverts to 
collecting events and watching for the at least one trigger 
condition. 

8. The system recited in claim 1, wherein the store utilizes 
data compression for storing the events. 

9. The system recited in claim 1, wherein the system utilizes
data compression for sending the events. 

10. A system for analyzing the performance of a data 
processing system that produces events, the system com- 
prising: 

a control station that analyzes events, the control station 
specifying at least one trigger condition, and the control 
station further specifying at least one filter; 

an event concentrator, coupled to the control station, that 
collects events in accordance with the at least one filter 
and that watches for the at least one trigger condition; 
and 
a store, coupled to the event concentrator, that stores
events until the occurrence of the at least one trigger 
condition, whereupon the stored events are sent to the 
control station. 

11. The system recited in claim 10, wherein the store is a 
circular buffer, and the control station specifies how many 
events to store in the buffer. 

12. The system recited in claim 10, wherein the control 
station specifies two trigger conditions and a Boolean 
relation between them. 

13. The system recited in claim 10, wherein when the at
least one trigger condition occurs, the event concentrator 
begins collecting events at a higher rate. 

14. The system recited in claim 10, wherein the control 
station specifies a reset condition, and wherein the event 
concentrator watches for a reset condition. 

15. The system recited in claim 14, wherein events are no 
longer sent to the control station when the event concen- 
trator detects a reset condition. 

16. The system recited in claim 15, wherein, upon occur-
rence of a reset condition, the event concentrator reverts
to collecting events and watching for the at least one 
trigger condition. 

17. The system recited in claim 10, wherein the control 
station specifies a first filter and a second filter, and the 
event concentrator collects events in accordance with the 
first filter until the at least one trigger condition occurs, 
whereupon the event concentrator collects events in 
accordance with the second filter. 

18. The system recited in claim 17, wherein the control 
station specifies a reset condition, and wherein the event 
concentrator watches for a reset condition. 

19. The system recited in claim 18, wherein events are no
longer sent to the control station when the event concen- 
trator detects a reset condition. 

20. The system recited in claim 18, wherein, upon occur- 
rence of a reset condition, the event concentrator reverts 
to collecting events in accordance with the first filter and 
watching for the at least one trigger condition. 

21. A method for analyzing the performance of a data 
processing system that produces events and that com- 
prises a control station that analyzes events, an event 
concentrator, and a store, the method comprising the steps 
of: 

the control station specifying at least one trigger condi- 
tion; 

the event concentrator collecting events while watching 
for the at least one trigger condition; 

the store storing events collected by the event concentra- 
tor; and 

the event concentrator sending the stored events to the 
control station upon the occurrence of the at least one 
trigger condition. 

22. The method recited in claim 21, wherein the store is a 
circular buffer, and the control station specifies how many 
events to store in the buffer. 

23. The method recited in claim 21, wherein the control 
station specifies two trigger conditions and a Boolean 
relationship between them. 

24. The method recited in claim 21, wherein when the at 
least one trigger condition occurs, the event concentrator 
begins collecting events at a higher rate. 

25. The method recited in claim 21, wherein when the at 
least one trigger condition occurs, the event concentrator 
begins sending events to the control station. 

26. The method recited in claim 21, wherein the control 
station specifies a reset condition, and wherein the event 
concentrator watches for a reset condition. 
27. The method recited in claim 26, wherein the event
concentrator no longer sends events to the control station 
when the event concentrator detects a reset condition. 

28. The method recited in claim 26, wherein, upon occur-
rence of a reset condition, the event concentrator reverts
to collecting events and watching for the at least one
trigger condition. 

29. The method recited in claim 21, wherein the store 
utilizes data compression for storing the events. 

30. The method recited in claim 21, wherein the event
concentrator utilizes data compression for sending the 
events. 

31. The method recited in claim 21, wherein the steps recited 
therein can be performed in any suitable order. 

32. A computer-readable medium having computer-
executable instructions for performing the steps recited in
claim 21.

33. A method for analyzing the performance of a data 
processing system that produces events and that com-
prises a control station that analyzes events, an event
concentrator, and a store, the method comprising the steps
of: 

the control station specifying at least one trigger condition 
and at least one filter; 

the event concentrator collecting events in accordance
with the at least one filter while watching for the at least
one trigger condition; 

the store storing events collected by the event concentra- 
tor; and 

the event concentrator sending the stored events to the
control station upon the occurrence of the at least one
trigger condition. 

34. The method recited in claim 33, wherein the store is a 
circular buffer, and the control station specifies how many 
events to store in the buffer. 

35. The method recited in claim 33, wherein the control
station specifies two trigger conditions and a Boolean 
relationship between them. 

36. The method recited in claim 33, wherein when the at 
least one trigger condition occurs, the event concentrator
begins collecting events at a higher rate.

37. The method recited in claim 33, wherein when the at 
least one trigger condition occurs, the event concentrator 
begins sending events to the control station. 

38. The method recited in claim 33, wherein the control 
station specifies a reset condition, and wherein the event
concentrator watches for a reset condition.

39. The method recited in claim 38, wherein the event 
concentrator no longer sends events to the control station 
when the event concentrator detects a reset condition. 

40. The method recited in claim 38, wherein, upon occur-
rence of a reset condition, the event concentrator reverts
to collecting events and watching for the at least one 
trigger condition. 

41. The method recited in claim 33, wherein the control 
station specifies a first filter and a second filter, and further
comprising the steps of:

the event concentrator collecting events in accordance 
with the first filter until the at least one trigger condition 
occurs; 

then the event concentrator collecting events in accor-
dance with the second filter.

42. The method recited in claim 41, wherein the control 
station specifies a reset condition, and wherein the event 
concentrator watches for the reset condition. 

43. The method recited in claim 42, wherein the event
concentrator no longer sends events to the control station 
when the event concentrator detects a reset condition. 
44. The method recited in claim 42, and further comprising
the step of: 

upon occurrence of a reset condition, the event concen- 
trator reverts to collecting events in accordance with 
the first filter and watching for the at least one trigger
condition. 

45. The method recited in claim 33, wherein the steps recited 
therein can be performed in any suitable order. 

46. A computer-readable medium having computer- 
executable instructions for performing the steps recited in
claim 33. 
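
The trigger behavior described in the Triggers material above (a circular buffer of a specified size, a trigger condition, and a reset condition) can be sketched as follows. This is a minimal illustration under stated assumptions: the class name TriggeredConcentrator, the use of callables for the trigger and reset conditions, and the choice to buffer the event that caused the reset are all hypothetical.

```python
from collections import deque


class TriggeredConcentrator:
    """Buffers recent events and forwards them once a trigger condition occurs."""

    def __init__(self, capacity, trigger, reset, send):
        # Circular buffer: once full, the oldest event is discarded automatically.
        self.buffer = deque(maxlen=capacity)
        self.trigger = trigger    # callable(event) -> bool, the trigger condition
        self.reset = reset        # callable(event) -> bool, the reset condition
        self.send = send          # callable(list_of_events), delivery to the control station
        self.triggered = False

    def on_event(self, event):
        if self.triggered:
            if not self.reset(event):
                # After the trigger, events go straight to the control station.
                self.send([event])
                return
            # Reset detected: stop sending and revert to buffering.
            self.triggered = False
        self.buffer.append(event)
        if self.trigger(event):
            # Flush the stored history, including the triggering event.
            self.send(list(self.buffer))
            self.buffer.clear()
            self.triggered = True
```

Because the buffer is bounded, the control station receives only the most recent events leading up to the trigger, which keeps collection overhead low while nothing of interest is happening.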

Filter Reduction 

1. A system for analyzing the performance of a data pro- 
cessing system that produces events, the system compris- 
ing: 

a control station that analyzes events, the control station
specifying at least one filter; and

an event concentrator, coupled to the control station, that
collects events in accordance with the filter.

2. The system recited in claim 1, wherein the at least one 
filter is a Boolean expression comprising two or more
events. 

3. The system recited in claim 2, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the vari-
ables.

4. The system recited in claim 1, wherein the data processing 
system further comprises a plurality of machines, and 
wherein the at least one filter is directed to at least one of 
the machines. 

5. The system recited in claim 4, wherein the at least one
filter is a Boolean expression comprising two or more 
machines. 

6. The system recited in claim 5, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the variables.

7. The system recited in claim 1, wherein the data processing 
system further comprises a plurality of machines, each 
running a plurality of processes, and wherein the at least 
one filter is directed to at least one of the processes. 

8. The system recited in claim 7, wherein the at least one
filter is a Boolean expression comprising two or more 
processes. 

9. The system recited in claim 8, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the variables.

10. The system recited in claim 1, wherein the data process- 
ing system further comprises a plurality of machines, each 
running a plurality of processes, and wherein the at least 
one filter is directed to at least one of the machines and at
least one of the processes. 

11. The system recited in claim 10, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

12. The system recited in claim 11, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the variables.

13. The system recited in claim 1, wherein the at least one 
filter designates one or more machines, processes, 
in-process event creators, dynamic event creators, events,
or threads. 

14. The system recited in claim 1, wherein the at least one 
filter is a Boolean expression. 

15. The system recited in claim 14, wherein the Boolean 
expression comprises a set of variables and the Boolean
expression is modified by binding a subset of the vari- 
ables. 



16. The system recited in claim 1, wherein the at least one 
filter is sent to one or more specific machines, processes, 
in-process event creators, dynamic event creators, events, 
or threads. 

17. The system recited in claim 1, wherein the at least one 
filter is broadcast throughout the data processing system, 
wherein it is applied by one or more specific machines, 
processes, in-process event creators, dynamic event 
creators, events, or threads. 

18. A system for analyzing the performance of a data 
processing system that produces events, the system com- 
prising: 

a control station that analyzes events, the control station
specifying at least one filter;

an in-process event creator that collects events generated
by a first data source within the data processing system;

a dynamic event creator that collects events that are
generated on a time basis by a second data source
within the data processing system; and

an event concentrator collecting events from the
in-process event creator and from the dynamic event
creator, in accordance with the at least one filter, and
sending the events to the control station.

19. The system recited in claim 18, wherein the at least one 
filter is a Boolean expression comprising two or more 
events. 

20. The system recited in claim 19, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the vari- 
ables. 

21. The system recited in claim 18, wherein the data 
processing system further comprises a plurality of 
machines, and wherein the at least one filter is directed to 
at least one of the machines. 

22. The system recited in claim 21, wherein the at least one 
filter is a Boolean expression comprising two or more 
machines. 

23. The system recited in claim 22, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the vari- 
ables. 

24. The system recited in claim 18, wherein the data 
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
wherein the at least one filter is directed to at least one of 
the processes. 

25. The system recited in claim 24, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

26. The system recited in claim 25, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the variables.

27. The system recited in claim 18, wherein the data 
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
wherein the at least one filter is directed to at least one of 
the machines and at least one of the processes. 

28. The system recited in claim 27, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

29. The system recited in claim 28, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the vari- 
ables. 

30. The system recited in claim 18, wherein the at least one 
filter designates one or more machines, processes, 
in-process event creators, dynamic event creators, events, 
or threads. 
31. The system recited in claim 18, wherein the at least one
filter is a Boolean expression. 

32. The system recited in claim 31, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the vari-
ables.

33. The system recited in claim 18, wherein the at least one 
filter is sent to one or more specific machines, processes, 
in-process event creators, dynamic event creators, events, 
or threads.

34. The system recited in claim 18, wherein the at least one 
filter is broadcast throughout the data processing system, 
wherein it is applied by one or more specific machines, 
processes, in-process event creators, dynamic event 
creators, events, or threads.

35. A method for analyzing the performance of a data 
processing system that produces events and that com- 
prises a control station that analyzes events, and an event 
concentrator, the method comprising the steps of: 

the control station specifying at least one filter; and
the event concentrator collecting events in accordance 
with the filter. 

36. The method recited in claim 35, in which in the speci- 
fying step the at least one filter is a Boolean expression 
comprising two or more events. 

37. The method recited in claim 36, in which in the speci- 
fying step the Boolean expression comprises a set of 
variables, and further comprising the step of: 
modifying the Boolean expression by binding a subset of
the variables.

38. The method recited in claim 35, wherein the data 
processing system further comprises a plurality of 
machines, and further comprising the step of: 

directing the at least one filter to at least one of the
machines. 

39. The method recited in claim 38, wherein the at least one 
filter is a Boolean expression comprising two or more
machines.

40. The method recited in claim 39, in which in the speci-
fying step the Boolean expression comprises a set of
variables, and further comprising the step of:
modifying the Boolean expression by binding a subset of
the variables.

41. The method recited in claim 35, wherein the data
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
further comprising the step of: 

directing the at least one filter to at least one of the 
processes.

42. The method recited in claim 41, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

43. The method recited in claim 42, in which in the specifying
step the Boolean expression comprises a set of
variables, and further comprising the step of:
modifying the Boolean expression by binding a subset of
the variables.

44. The method recited in claim 35, wherein the data
processing system further comprises a plurality of
machines, each running a plurality of processes, and
further comprising the step of:

directing the at least one filter to at least one of the 
machines and at least one of the processes. 

45. The method recited in claim 44, wherein the at least one
filter is a Boolean expression comprising two or more
processes.

46. The method recited in claim 45, in which in the specifying
step the Boolean expression comprises a set of
variables, and further comprising the step of:
modifying the Boolean expression by binding a subset of
the variables.

47. The method recited in claim 35, wherein the at least one 
filter designates one or more machines, processes, 
in-process event creators, dynamic event creators, events, 
or threads. 

48. The method recited in claim 35, wherein the at least one 
filter is a Boolean expression. 

49. The method recited in claim 48, in which in the speci- 
fying step the Boolean expression comprises a set of 
variables, and further comprising the step of: 
modifying the Boolean expression by binding a subset of
the variables.

50. The method recited in claim 35, and further comprising 
the step of: 

sending the at least one filter to one or more specific 
machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

51. The method recited in claim 35, and further comprising 
the step of: 

broadcasting the at least one filter throughout the data 
processing system, wherein it is applied by one or more 
specific machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

52. The method recited in claim 35, wherein the steps recited 
therein can be performed in any suitable order. 

53. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 35. 

54. A method for analyzing the performance of a data 
processing system that comprises an in-process event 
creator that collects events generated by a first data source 
within the data processing system, a dynamic event cre- 
ator that collects events that are generated on a time basis 
by a second data source within the data processing 
system, an event concentrator, and a control station that 
analyzes events, the method comprising the steps of: 
the control station specifying at least one filter; 

the in-process event creator and the dynamic event creator 
each collecting events in accordance with the at least 
one filter; and 

the event concentrator collecting events from the 
in-process event creator and from the dynamic event 
creator in accordance with the at least one filter, and 
sending the events to the control station. 

55. The method recited in claim 54, in which in the speci- 
fying step the at least one filter is a Boolean expression 
comprising two or more events. 

56. The method recited in claim 55, in which in the speci- 
fying step the Boolean expression comprises a set of 
variables, and further comprising the step of: 
modifying the Boolean expression by binding a subset of
the variables.

57. The method recited in claim 54, wherein the data 
processing system further comprises a plurality of 
machines, and further comprising the step of: 
directing the at least one filter to at least one of the
machines.

58. The method recited in claim 57, in which in the specifying
step the at least one filter is a Boolean expression
comprising two or more machines.

59. The method recited in claim 58, in which in the speci- 
fying step the Boolean expression comprises a set of 
variables, and further comprising the step of: 
modifying the Boolean expression by binding a subset of
the variables. 

60. The method recited in claim 54, wherein the data 
processing system further comprises a plurality of 
machines, each running a plurality of processes, and
further comprising the step of: 

directing the at least one filter to at least one of the 
processes. 

61. The method recited in claim 60, in which in the specifying
step the at least one filter is a Boolean expression
comprising two or more processes.

62. The method recited in claim 61, in which in the speci- 
fying step the Boolean expression comprises a set of 
variables, and further comprising the step of: 
modifying the Boolean expression by binding a subset of
the variables.

63. The method recited in claim 54, wherein the data 
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
further comprising the step of:
directing the at least one filter to at least one of the
machines and at least one of the processes.

64. The method recited in claim 63, in which in the speci- 
fying step the at least one filter is a Boolean expression 
comprising two or more processes. 

65. The method recited in claim 64, in which in the speci- 
fying step the Boolean expression comprises a set of 
variables, and further comprising the step of: 
modifying the Boolean expression by binding a subset of
the variables.

66. The method recited in claim 54, in which in the speci- 
fying step the at least one filter designates one or more 
machines, processes, in-process event creators, dynamic 
event creators, events, or threads.

67. The method recited in claim 54, in which in the speci- 
fying step the at least one filter is a Boolean expression. 

68. The method recited in claim 67, in which in the specifying
step the Boolean expression comprises a set of
variables, and further comprising the step of:
modifying the Boolean expression by binding a subset of
the variables.

69. The method recited in claim 54, and further comprising 
the step of: 

sending the at least one filter to one or more specific
machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

70. The method recited in claim 54, and further comprising 
the step of: 

broadcasting the at least one filter throughout the data
processing system, wherein it is applied by one or more 
specific machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

71. The method recited in claim 54, wherein the steps recited 
therein can be performed in any suitable order.

72. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 54. 

73. In a system for analyzing the performance of a data
processing system which produces events, and which has
a graphical user interface including a display and a user
interface selection device, a method of providing and
selecting a filter from a menu on the display, the filter
specifying which events to monitor, the method comprising
the steps of:
displaying a menu on the display listing one or more items
from a group comprising items representing event-generating
machines, items representing event-generating
components, and items representing categories
of events within the data processing system;
receiving a menu entry selection signal indicative of the 
user interface selection device selecting one of the 
items to monitor; and 
repeating the immediately previous step, as necessary, 
until all desired items have been selected. 

74. The method recited in claim 73, in which the at least one 
filter is a Boolean expression comprising two or more 
events. 

75. The method recited in claim 74, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

76. The method recited in claim 73, wherein the data 
processing system further comprises a plurality of 
machines, and further comprising the step of: 
directing the at least one filter to at least one of the
machines.

77. The method recited in claim 76, wherein the at least one 
filter is a Boolean expression comprising two or more 
machines. 

78. The method recited in claim 77, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

79. The method recited in claim 73, wherein the data 
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
further comprising the step of: 

directing the at least one filter to at least one of the 
processes. 

80. The method recited in claim 79, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

81. The method recited in claim 80, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

82. The method recited in claim 73, wherein the data 
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
further comprising the step of:

directing the at least one filter to at least one of the 
machines and at least one of the processes. 

83. The method recited in claim 82, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

84. The method recited in claim 83, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

85. The method recited in claim 73, wherein the at least one 
filter designates one or more machines, processes, 
in-process event creators, dynamic event creators, events, 
or threads. 

86. The method recited in claim 73, wherein the at least one 
filter is a Boolean expression. 

87. The method recited in claim 86, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 
modifying the Boolean expression by binding a subset of
the variables. 

88. The method recited in claim 73, and further comprising 
the step of: 

sending the at least one filter to one or more specific
machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

89. The method recited in claim 73, and further comprising 
the step of: 

broadcasting the at least one filter throughout the data
processing system, wherein it is applied by one or more 
specific machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

90. The method recited in claim 73, wherein the steps recited 
therein can be performed in any suitable order. 

91. A computer-readable medium having computer-executable
instructions for performing the steps recited in
claim 73. 
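
For illustration only, and not as part of the claims, the notion of a filter that is a Boolean expression over a set of variables, modified by binding a subset of those variables, might be sketched in Python as follows. The names `machine_and_process`, `bind`, and `collect`, and the event fields `machine` and `process`, are hypothetical and do not appear in the source text.

```python
from typing import Any, Callable, Dict, List, Optional

Event = Dict[str, Any]
Bindings = Dict[str, Any]
# Assumption: a filter is modeled as a predicate over an event and variable bindings.
Filter = Callable[[Event, Optional[Bindings]], bool]


def machine_and_process(event: Event, variables: Optional[Bindings] = None) -> bool:
    """Boolean expression: machine == $machine AND process == $process."""
    variables = variables or {}
    return (event["machine"] == variables["machine"]
            and event["process"] == variables["process"])


def bind(flt: Filter, **bound: Any) -> Filter:
    """Return a new filter in which a subset of the variables is bound."""
    def bound_filter(event: Event, variables: Optional[Bindings] = None) -> bool:
        merged = dict(variables or {})
        merged.update(bound)  # bound values take precedence
        return flt(event, merged)
    return bound_filter


def collect(events: List[Event], flt: Filter,
            variables: Optional[Bindings] = None) -> List[Event]:
    """Collect only the events that satisfy the filter."""
    return [e for e in events if flt(e, variables or {})]
```

In this sketch, `bind(machine_and_process, machine="m1")` yields a filter with one variable still free, so the remaining variable can be supplied later, for example when the filter reaches a particular machine or process.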

Filter Combination 

1. A system for analyzing the performance of a data processing
system that produces events, the system comprising:

at least one control station that analyzes events, the at
least one control station specifying more than one filter
to monitor events, or threads; and

an event concentrator, coupled to the at least one control
station, that combines the filters and collects events in
accordance with the filters.

2. The system recited in claim 1, wherein at least one of the 
filters is a Boolean expression comprising two or more 
events.

3. The system recited in claim 2, wherein the at least one 
Boolean expression comprises a set of variables and the at 
least one Boolean expression is modified by binding a 
subset of the variables. 

4. The system recited in claim 1, wherein the data processing
system further comprises a plurality of machines, and 
wherein at least one filter is directed to at least one of the 
machines. 

5. The system recited in claim 4, wherein the at least one 
filter is a Boolean expression comprising two or more 
machines.

6. The system recited in claim 5, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the vari- 
ables. 

7. The system recited in claim 1, wherein the data processing
system further comprises a plurality of machines, each 
running a plurality of processes, and wherein at least one 
filter is directed to at least one of the processes. 

8. The system recited in claim 7, wherein the at least one 
filter is a Boolean expression comprising two or more
processes. 

9. The system recited in claim 8, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the variables.

10. The system recited in claim 1, wherein the data process- 
ing system further comprises a plurality of machines, each 
running a plurality of processes, and wherein at least one 
filter is directed to at least one of the machines and at least 
one of the processes.

11. The system recited in claim 10, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

12. The system recited in claim 11, wherein the Boolean 
expression comprises a set of variables and the Boolean
expression is modified by binding a subset of the vari- 
ables. 
13. The system recited in claim 1, wherein at least one filter
designates one or more machines, processes, in-process 
event creators, dynamic event creators, events, or threads. 

14. The system recited in claim 1, wherein at least one filter 
is a Boolean expression. 

15. The system recited in claim 14, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the variables.

16. The system recited in claim 1, wherein at least one filter 
is sent to one or more specific machines, processes, 
in-process event creators, dynamic event creators, events, 
or threads. 

17. The system recited in claim 1, wherein at least one filter 
is broadcast throughout the data processing system, 
wherein it is applied by one or more specific machines, 
processes, in-process event creators, dynamic event 
creators, events, or threads. 

18. The system recited in claim 1 and comprising two or 
more control stations, and wherein the event concentrator 
collects events from a plurality of sources within the data 
processing system and routes them to the respective 
control stations which specified that the events be monitored.

19. A system for analyzing the performance of a data 
processing system that produces events, the system com- 
prising: 

at least one control station that analyzes events, the at 
least one control station specifying more than one filter; 

an in-process event creator that collects events generated 
by a first data source within the data processing system; 

a dynamic event creator that collects events that are 
generated on a time basis by a second data source 
within the data processing system; and 

an event concentrator collecting events from the 
in-process event creator and from the dynamic event 
creator, in accordance with a combination of the filters, 
and sending the events to the control station. 

20. The system recited in claim 19, wherein at least one of 
the filters is a Boolean expression comprising two or more events.

21. The system recited in claim 20, wherein the at least one 
Boolean expression comprises a set of variables and the at 
least one Boolean expression is modified by binding a 
subset of the variables. 

22. The system recited in claim 19, wherein the data 
processing system further comprises a plurality of 
machines, and wherein at least one filter is directed to at 
least one of the machines. 

23. The system recited in claim 22, wherein the at least one 
filter is a Boolean expression comprising two or more 
machines. 

24. The system recited in claim 23, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the vari- 
ables. 

25. The system recited in claim 19, wherein the data 
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
wherein at least one filter is directed to at least one of the 
processes. 

26. The system recited in claim 25, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

27. The system recited in claim 26, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the vari- 
ables. 
28. The system recited in claim 19, wherein the data
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
wherein at least one filter is directed to at least one of the 
machines and at least one of the processes.

29. The system recited in claim 28, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

30. The system recited in claim 29, wherein the Boolean 
expression comprises a set of variables and the Boolean
expression is modified by binding a subset of the variables.

31. The system recited in claim 19, wherein at least one filter 
designates one or more machines, processes, in-process 
event creators, dynamic event creators, events, or threads.

32. The system recited in claim 19, wherein at least one filter 
is a Boolean expression. 

33. The system recited in claim 32, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the variables.

34. The system recited in claim 19, wherein at least one filter 
is sent to one or more specific machines, processes, 
in-process event creators, dynamic event creators, events, 
or threads.

35. The system recited in claim 19, wherein at least one filter 
is broadcast throughout the data processing system, 
wherein it is applied by one or more specific machines, 
processes, in-process event creators, dynamic event 
creators, events, or threads.

36. The system recited in claim 19 and comprising two or 
more control stations, and wherein the event concentrator 
collects events from a plurality of sources within the data 
processing system and routes them to the respective 
control stations which specified that the events be monitored.

37. A method for analyzing the performance of a data 
processing system that produces events and that com- 
prises at least one control station that analyzes events, and 
an event concentrator, the method comprising the steps of:
the at least one control station specifying more than one
filter; and

the event concentrator combining the filters and collecting 
events in accordance with the combined filter. 

38. The method recited in claim 37, in which at least one
filter is a Boolean expression comprising two or more events.

39. The method recited in claim 38, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

40. The method recited in claim 37, wherein the data 
processing system further comprises a plurality of 
machines, and further comprising the step of: 
directing at least one filter to at least one of the machines. 

41. The method recited in claim 40, wherein the at least one 
filter is a Boolean expression comprising two or more 
machines.

42. The method recited in claim 41, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables.

43. The method recited in claim 37, wherein the data
processing system further comprises a plurality of
machines, each running a plurality of processes, and
further comprising the step of:

directing at least one filter to at least one of the processes.

44. The method recited in claim 43, wherein the at least one
filter is a Boolean expression comprising two or more
processes.

45. The method recited in claim 44, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

46. The method recited in claim 37, wherein the data 
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
further comprising the step of:

directing at least one filter to at least one of the machines 
and at least one of the processes. 

47. The method recited in claim 46, wherein the at least one
filter is a Boolean expression comprising two or more
processes.

48. The method recited in claim 47, in which the Boolean
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

49. The method recited in claim 37, wherein at least one 
filter designates one or more machines, processes, 
in-process event creators, dynamic event creators, events, 
or threads. 

50. The method recited in claim 37, wherein at least one 
filter is a Boolean expression. 

51. The method recited in claim 50, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

52. The method recited in claim 37, and further comprising 
the step of: 

sending at least one filter to one or more specific 
machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

53. The method recited in claim 37, and further comprising 
the step of: 

broadcasting at least one filter throughout the data pro- 
cessing system, wherein it is applied by one or more 
specific machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

54. The method recited in claim 37, wherein the data 
processing system comprises two or more control stations 
each specifying more than one filter, and further compris- 
ing the steps of: 

the event concentrator collecting events from a plurality 
of sources within the data processing system; and 

the event concentrator routing the events to the respective 
control stations which specified that the events be monitored.

55. The method recited in claim 37, wherein the steps recited 
therein can be performed in any suitable order. 

56. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 37. 

57. A method for analyzing the performance of a data 
processing system that comprises an in-process event 
creator that collects events generated by a first data source 
within the data processing system, a dynamic event cre- 
ator that collects events that are generated on a time basis 
by a second data source within the data processing
system, an event concentrator, and at least one control 
station that analyzes events, the method comprising the 
steps of: 

the at least one control station specifying more than one
filter; 

the event concentrator combining the filters into a com- 
bined filter; 

the in-process event creator and the dynamic event creator 
each collecting events in accordance with the combined
filter; and 

the event concentrator collecting events from the 
in-process event creator and from the dynamic event 
creator, in accordance with the combined filter, and 
sending the events to the at least one control station.

58. The method recited in claim 57, in which at least one 
filter is a Boolean expression comprising two or more 
events. 

59. The method recited in claim 58, in which the Boolean 
expression comprises a set of variables, and further comprising
the step of:

modifying the Boolean expression by binding a subset of 
the variables. 

60. The method recited in claim 57, wherein the data 
processing system further comprises a plurality of
machines, and further comprising the step of: 
directing at least one filter to at least one of the machines. 

61. The method recited in claim 60, wherein the at least one 
filter is a Boolean expression comprising two or more
machines. 

62. The method recited in claim 61, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of
the variables. 

63. The method recited in claim 57, wherein the data 
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
further comprising the step of:
directing at least one filter to at least one of the processes. 

64. The method recited in claim 63, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

65. The method recited in claim 64, in which the Boolean
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

66. The method recited in claim 57, wherein the data
processing system further comprises a plurality of 
machines, each running a plurality of processes, and 
further comprising the step of:

directing at least one filter to at least one of the machines 
and at least one of the processes.

67. The method recited in claim 66, wherein the at least one 
filter is a Boolean expression comprising two or more 
processes. 

68. The method recited in claim 67, in which the Boolean 
expression comprises a set of variables, and further comprising
the step of:

modifying the Boolean expression by binding a subset of 
the variables. 

69. The method recited in claim 57, wherein at least one 
filter designates one or more machines, processes,
in-process event creators, dynamic event creators, events, 
or threads. 




70. The method recited in claim 57, wherein at least one 
filter is a Boolean expression. 

71. The method recited in claim 70, in which the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

72. The method recited in claim 57, and further comprising 
the step of: 

sending at least one filter to one or more specific 
machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

73. The method recited in claim 57, and further comprising 
the step of: 

broadcasting at least one filter throughout the data pro- 
cessing system, wherein it is applied by one or more 
specific machines, processes, in-process event creators, 
dynamic event creators, events, or threads. 

74. The method recited in claim 57, wherein the data 
processing system comprises two or more control stations 
each specifying more than one filter, and further compris- 
ing the steps of: 

the event concentrator collecting events from a plurality 
of sources within the data processing system; and 

the event concentrator routing the events to the respective 
control stations which specified that the events be 
monitored. 

75. The method recited in claim 57, wherein the steps recited 
therein can be performed in any suitable order. 

76. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 57. 
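
For illustration only, and not as part of the claims, the combining of filters by an event concentrator and the routing of collected events back to the control stations that specified them might be sketched in Python as follows. The claims do not state how filters are combined; this sketch assumes a logical OR. The names `combine` and `route` and the station labels are hypothetical.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Filter = Callable[[Event], bool]


def combine(filters_by_station: Dict[str, List[Filter]]) -> Filter:
    """Combined filter. Assumption: an event passes if any station's filter accepts it."""
    all_filters = [f for fs in filters_by_station.values() for f in fs]
    return lambda event: any(f(event) for f in all_filters)


def route(events: List[Event],
          filters_by_station: Dict[str, List[Filter]]) -> Dict[str, List[Event]]:
    """Collect events under the combined filter, then route each event
    to every control station whose own filters requested it."""
    combined = combine(filters_by_station)
    routed: Dict[str, List[Event]] = {station: [] for station in filters_by_station}
    for event in events:
        if not combined(event):
            continue  # no station asked for this event; do not collect it
        for station, filters in filters_by_station.items():
            if any(f(event) for f in filters):
                routed[station].append(event)
    return routed
```

Under this assumption the combined filter lets the concentrator discard unwanted events with a single test, while the per-station check preserves which control station specified that each event be monitored.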

Filter Specification 

1. A system for analyzing the performance of a data pro- 
cessing system that produces events, the system compris- 
ing: 

a control station that analyzes events, the control station 
providing a graphical user interface for enabling a user 
to specify at least one filter; and 

an event concentrator, coupled to the control station, that 
collects events in accordance with the at least one filter. 

2. The system recited in claim 1, wherein the at least one 
filter is a Boolean expression comprising two or more 
events. 

3. The system recited in claim 2, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the variables.

4. The system recited in claim 1, wherein the graphical user 
interface comprises a window for displaying a textual 
representation of the at least one filter. 

5. The system recited in claim 4, wherein the at least one 
filter can be entered by the user as text. 

6. The system recited in claim 1, wherein the control station 
analyzes events collected by the event concentrator as the 
events are collected. 

7. The system recited in claim 1, wherein the control station 
analyzes events collected by the event concentrator after 
the events are collected. 

8. The system recited in claim 1, wherein the graphical user 
interface comprises a pre-defined list of filters from which 
a user can specify at least one filter. 

9. The system recited in claim 1, wherein the at least one 
filter is a debug or trace switch. 

10. A system for analyzing the performance of a data 
processing system that produces events, the system com- 
prising: 
a control station that analyzes events, the control station
providing a graphical user interface for enabling a user
to specify at least one filter;

an in-process event creator that collects events generated
by a first data source within the data processing system;

a dynamic event creator that collects events that are
generated on a time basis by a second data source
within the data processing system; and

an event concentrator collecting events from the
in-process event creator and from the dynamic event
creator, in accordance with the at least one filter, and
sending the events to the control station.

11. The system recited in claim 10, wherein the at least one 
filter is a Boolean expression comprising two or more 
events.

12. The system recited in claim 11, wherein the Boolean 
expression comprises a set of variables and the Boolean 
expression is modified by binding a subset of the vari- 
ables. 

13. The system recited in claim 10, wherein the graphical
user interface comprises a window for displaying a textual 
representation of the at least one filter. 

14. The system recited in claim 13, wherein the at least one 
filter can be entered by the user as text. 

15. The system recited in claim 10, wherein the control 
station analyzes events collected by the event concentra- 
tor as the events are collected. 

16. The system recited in claim 10, wherein the control 
station analyzes events collected by the event concentrator
after the events are collected.

17. The system recited in claim 10, wherein the graphical 
user interface comprises a pre-defined list of filters from 
which a user can specify at least one filter. 

18. The system recited in claim 10, wherein the at least one 
filter is a debug or trace switch.

19. A method for analyzing the performance of a data 
processing system that produces events and that com- 
prises a control station that analyzes events, and an event 
concentrator, the method comprising the steps of: 

the control station providing a graphical user interface for
enabling a user to specify at least one filter; and 

the event concentrator collecting events in accordance 
with the filter. 

20. The method recited in claim 19, wherein the at least one 
filter is a Boolean expression comprising two or more
events.

21. The method recited in claim 20, wherein the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of
the variables. 

22. The method recited in claim 19, wherein the graphical 
user interface comprises a window, and further compris- 
ing the step of: 

displaying a textual representation of the at least one filter
in the window. 

23. The method recited in claim 19, and further comprising 
the step of: 

providing a window as part of the graphical user interface, 
into which window the at least one filter can be entered
by the user as text. 

24. The method recited in claim 19, wherein the control 
station analyzes events collected by the event concentra- 
tor as the events are collected. 

25. The method recited in claim 19, wherein the control
station analyzes events collected by the event concentra- 
tor after the events are collected. 
26. The method recited in claim 19, wherein the graphical
user interface comprises a pre-defined list of filters from 
which a user can specify at least one filter. 

27. The method recited in claim 19, wherein the at least one 
filter is a debug or trace switch. 

28. The method recited in claim 19, wherein the steps recited 
therein can be performed in any suitable order. 

29. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 19. 

30. A method for analyzing the performance of a data 
processing system that comprises an in-process event 
creator that collects events generated by a first data source 
within the data processing system, a dynamic event cre- 
ator that collects events that are generated on a time basis 
by a second data source within the data processing 
system, an event concentrator, and a control station that 
analyzes events, the method comprising the steps of: 
the control station providing a graphical user interface for
enabling a user to specify at least one filter;
the in-process event creator and the dynamic event creator 
each collecting events in accordance with the at least 
one filter; and 

the event concentrator collecting events from the 
in-process event creator and from the dynamic event 
creator, in accordance with the at least one filter, and 
sending the events to the control station. 

31. The method recited in claim 30, wherein the at least one
filter is a Boolean expression comprising two or more 
events. 

32. The method recited in claim 31, wherein the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

33. The method recited in claim 30, wherein the graphical 
user interface comprises a window, and further compris- 
ing the step of: 

displaying a textual representation of the at least one filter 
in the window. 

34. The method recited in claim 30, and further comprising 
the step of: 

providing a window as part of the graphical user interface, 
into which window the at least one filter can be entered
by the user as text.

35. The method recited in claim 30, wherein the control 
station analyzes events collected by the event concentra- 
tor as the events are collected. 

36. The method recited in claim 30, wherein the control 
station analyzes events collected by the event concentra- 
tor after the events are collected. 

37. The method recited in claim 30, wherein the graphical 
user interface comprises a pre-defined list of filters from 
which a user can specify at least one filter. 

38. The method recited in claim 30, wherein the at least one 
filter is a debug or trace switch. 

39. The method recited in claim 30, wherein the steps recited 
therein can be performed in any suitable order. 

40. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 30. 

41. In a system for analyzing the performance of a data 
processing system which produces events, and which has 
a graphical user interface including a display and a user 
interface selection device, a method of providing and 
selecting a filter from a menu on the display, the filter 
specifying which events to monitor, the method comprising
the steps of:

displaying a menu on the display listing items represent- 
ing event-generating machines, event-generating 
components, and categories of events within the data 
processing system; 

receiving a menu entry selection signal indicative of the 
user interface selection device selecting one of the 
items to monitor; and 

repeating the immediately previous step, as necessary, 
until all desired items have been selected. 

42. The method recited in claim 41, wherein the at least one 
filter is a Boolean expression comprising two or more
events.

43. The method recited in claim 42, wherein the Boolean 
expression comprises a set of variables, and further com- 
prising the step of: 

modifying the Boolean expression by binding a subset of 
the variables. 

44. The method recited in claim 41, wherein the graphical 
user interface comprises a window, and further compris- 
ing the step of: 

displaying a textual representation of the at least one filter 
in the window. 

45. The method recited in claim 41, and further comprising 
the step of: 

providing a window as part of the graphical user interface, 
into which window the at least one filter can be entered 
as text. 

46. The method recited in claim 41, wherein the control 
station analyzes events collected by the event concentra- 
tor as the events are collected. 

47. The method recited in claim 41, wherein the control 
station analyzes events collected by the event concentra- 
tor after the events are collected. 

48. The method recited in claim 41, wherein the graphical 
user interface comprises a pre-defined list of filters from 
which a user can specify at least one filter. 

49. The method recited in claim 41, wherein the items 
representing event-generating components refer to data 
sources on event-generating machines being monitored. 

50. The method recited in claim 41, wherein the at least one 
filter is a debug or trace switch. 

51. The method recited in claim 41, wherein the steps recited 
therein can be performed in any suitable order. 

52. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 41. 

APIs 

1. A computer system comprising: 

a computer comprising a processor and a memory opera- 
tively coupled together; 

an operating system executing in the processor; 

an application program running under the control of the 
operating system, the application program having an 
event-generating component; and 

application program interfaces associated with the event- 
generating component operative to receive data from 
the operating system and send data to the operating 
system. 

2. The computer system of claim 1, wherein the application 
program interfaces comprise: 

a first interface that enables the operating system to set or 
disable a status condition in the application; and 

a second interface that receives a status query from the 
operating system and that returns the status of the status 
condition to the operating system. 
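
As an illustration only, the two interfaces of claim 2 can be pictured as a pair of methods on an application-side component. The class name `EventGeneratingComponent` and the method names `set_status` and `query_status` are hypothetical; the claims do not name any concrete interface.

```python
class EventGeneratingComponent:
    """Hypothetical application-side component exposing two interfaces."""

    def __init__(self) -> None:
        # The status condition starts out disabled in this sketch.
        self._status_set = False

    def set_status(self, enabled: bool) -> None:
        """First interface: the caller sets or disables the status condition."""
        self._status_set = bool(enabled)

    def query_status(self) -> bool:
        """Second interface: answer a status query with the current status."""
        return self._status_set
```

Here the caller plays the role of the operating system: it toggles the condition through the first method and reads it back through the second.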

3. The computer system of claim 1, wherein the application
program interfaces comprise:

an interface that enables the operating system to read any
one or more of several fields in the application, the
fields being from a group comprising arguments, causality
i.d., correlation i.d., dynamic event data,
exception, return value, security i.d., source
component, source handle, source machine, source
process, source process name, source session, source
thread, target component, target handle, target machine,
target process, target process name, target session, and
target thread.

4. A computer system comprising: 

a computer comprising a processor and a memory operatively
coupled together;

an operating system executing in the processor, the operating
system having an event-registering component;

an application program running under the control of the
operating system; and

application program interfaces associated with the
event-registering component operative to receive data from
the operating system and send data to the operating
system.

5. The computer system of claim 4, wherein the application
program interfaces comprise:

a first interface that enables the operating system to query 
whether a status condition is set or disabled in the 
application; and

a second interface that returns data to the operating
system only if the status condition is set. 

6. The computer system of claim 4, wherein the application 
program interfaces comprise: 

an interface that enables the operating system to read any
one or more of several fields in the application, the
fields being from a group comprising arguments, causality
i.d., correlation i.d., dynamic event data,
exception, return value, security i.d., source
component, source handle, source machine, source
process, source process name, source session, source
thread, target component, target handle, target machine,
target process, target process name, target session, and
target thread.
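
As an illustration only, the field group recited in claims 3 and 6 can be written down as a record type with a helper that reads any one or more fields. The names `EventRecord`, `FIELD_NAMES`, and `read_fields` are hypothetical, and the snake_case spellings are an assumption made for this sketch.

```python
from dataclasses import field, fields, make_dataclass
from typing import Any, Dict, Iterable

# Fields that describe the event itself.
COMMON = ["arguments", "causality_id", "correlation_id", "dynamic_event_data",
          "exception", "return_value", "security_id"]
# Fields that exist once for the source and once for the target.
ENDPOINT = ["component", "handle", "machine", "process", "process_name",
            "session", "thread"]
FIELD_NAMES = COMMON + [f"{side}_{part}"
                        for side in ("source", "target") for part in ENDPOINT]

# Every field is optional and defaults to None in this sketch.
EventRecord = make_dataclass(
    "EventRecord", [(name, Any, field(default=None)) for name in FIELD_NAMES])


def read_fields(event, names: Iterable[str]) -> Dict[str, Any]:
    """Read any one or more of the recited fields from an event record."""
    names = list(names)
    valid = {f.name for f in fields(event)}
    unknown = [n for n in names if n not in valid]
    if unknown:
        raise KeyError(f"not a recited field: {unknown}")
    return {n: getattr(event, n) for n in names}
```

A caller could then request a subset, for example `read_fields(record, ["source_machine", "target_thread"])`.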

7. A set of application program interfaces embodied on a
computer-readable medium for execution on a computer
in conjunction with an operating system that interfaces
with an application program having an event-generating
component, comprising:

a first interface that enables the operating system to set or
disable a status condition in the application; and

a second interface that receives a status query from the
operating system and that returns the status condition to
the operating system.

8. The set of application program interfaces recited in claim
7 and further comprising:

a third interface that enables the operating system to read
any one or more of several fields in the application, the
fields being from a group comprising arguments, causality
i.d., correlation i.d., dynamic event data,
exception, return value, security i.d., source
component, source handle, source machine, source
process, source process name, source session, source
thread, target component, target handle, target machine,
target process, target process name, target session, and
target thread.

9. A set of application program interfaces embodied on a 
computer-readable medium for execution on a computer 
in conjunction with an application program that interfaces
with an operating system having an event-registering 
component, comprising: 

a first interface that enables the operating system to query 
whether a status condition is set or disabled in the 
application; and 

a second interface that returns data to the operating 
system only if the status condition is set.

10. The set of application program interfaces recited in claim
9 and further comprising:

a third interface that enables the operating system to read
any one or more of several fields in the application, the
fields being from a group comprising arguments, causality
i.d., correlation i.d., dynamic event data,
exception, return value, security i.d., source
component, source handle, source machine, source
process, source process name, source session, source
thread, target component, target handle, target machine,
target process, target process name, target session, and
target thread.

Animated Application Model

1. A system for analyzing the performance of a data pro- 
cessing system that produces events, the system compris- 
ing: 

an event concentrator that collects events; and
a control station, coupled to the event concentrator, that 
analyzes events, the control station displaying a model 
of the functionally active structure of the data process- 
ing system. 

2. The system recited in claim 1, wherein the model is a
hierarchical model. 

3. The system recited in claim 2, wherein the hierarchical 
model includes items from a group comprising machines, 
processes, data sources, entities, and instances. 
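
As an illustration only, the hierarchical model of claims 2 and 3 can be sketched as a tree whose levels follow the recited group. The nesting order in `LEVELS`, the `Node` class, its `expanded` flag, and the helpers `add_path` and `render` are assumptions made for this sketch; the claims list the item kinds without fixing a data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence

# Assumed nesting order, outermost first.
LEVELS = ["machine", "process", "data_source", "entity", "instance"]


@dataclass
class Node:
    name: str
    level: str
    children: Dict[str, "Node"] = field(default_factory=dict)
    expanded: bool = True  # a collapsed node hides its children when rendered


def add_path(root: Node, path: Sequence[str]) -> Node:
    """Insert one machine/process/data source/entity/instance path."""
    node = root
    for level, name in zip(LEVELS, path):
        if name not in node.children:
            node.children[name] = Node(name, level)
        node = node.children[name]
    return node


def render(node: Node, depth: int = 0) -> List[str]:
    """Return indented text lines for the visible part of the model."""
    lines = ["  " * depth + f"{node.level}: {node.name}"]
    if node.expanded:
        for child in node.children.values():
            lines.extend(render(child, depth + 1))
    return lines
```

Shared prefixes are merged, so two instances of the same entity appear as siblings under one entity node.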

4. The system recited in claim 1, wherein the control station
displays items from a group comprising summary data, 
time data, event details, and a call tree simultaneously 
while displaying the model. 

5. The system recited in claim 1, wherein the model is 
updated in real time as the control station analyzes the
events.

6. The system recited in claim 1, wherein the control station 
comprises a graphical user interface including a display 
and a user interface selection device. 

7. The system recited in claim 6, wherein the graphical user
interface employs a video cassette recorder (VCR) para- 
digm to enable a user to analyze the performance of the 
data processing system. 

8. The system recited in claim 7, wherein the VCR paradigm 
displays user-selectable commands from a group comprising
playing, replaying, stopping, reversing, pausing,
and changing the speed of the model. 
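
As an illustration only, the VCR-style commands of claim 8 can be sketched as a cursor over a recorded event sequence. The `Playback` class and its method names are hypothetical, and `speed` is merely stored here on the assumption that a user-interface timer would consult it.

```python
class Playback:
    """Hypothetical VCR-style cursor over a recorded sequence of events."""

    def __init__(self, events):
        self.events = list(events)
        self.position = 0      # index of the next event in forward order
        self.direction = 1     # +1 forward, -1 reverse
        self.speed = 1.0       # playback rate for a UI timer (assumption)
        self.playing = False

    def play(self):
        self.direction = 1
        self.playing = True

    def reverse(self):
        self.direction = -1
        self.playing = True

    def pause(self):
        self.playing = False

    def stop(self):
        self.playing = False
        self.position = 0

    def replay(self):
        self.stop()
        self.play()

    def set_speed(self, factor: float):
        if factor <= 0:
            raise ValueError("speed must be positive")
        self.speed = factor

    def step(self):
        """Return the next event in the current direction, or None when
        paused or when either end of the recording has been reached."""
        if not self.playing:
            return None
        index = self.position if self.direction == 1 else self.position - 1
        if not 0 <= index < len(self.events):
            self.playing = False
            return None
        self.position += self.direction
        return self.events[index]
```

Each call to `step` would drive one update of the displayed model in such a design.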

9. The system recited in claim 1, wherein the control station 
uses heuristics from a group comprising time-ordering, 
causality information, and event handles to display the
model. 

10. The system recited in claim 1, wherein the control station 
displays active portions of the model in a visually
distinctive manner.

11. A system for analyzing the performance of a data
processing system that produces events, the system com- 
prising: 

an in-process event creator that collects events generated 
by a first data source within the data processing system; 

a dynamic event creator that collects events that are
generated on a time basis by a second data source 
within the data processing system; 
an event concentrator collecting events from the
in-process event creator and from the dynamic event
creator; and

a control station, coupled to the event concentrator, that
analyzes events, the control station displaying a model
of the functionally active structure of the data process- 
ing system. 

12. The system recited in claim 11, wherein the model is a 
hierarchical model.

13. The system recited in claim 12, wherein the hierarchical
model includes items from a group comprising machines, 
processes, data sources, entities, and instances. 

14. The system recited in claim 11, wherein the control 
station displays items from a group comprising summary 
data, time data, event details, and a call tree simulta- 
neously while displaying the model. 

15. The system recited in claim 11, wherein the model is 
updated in real time as the control station analyzes the
events.

16. The system recited in claim 11, wherein the control 
station comprises a graphical user interface including a 
display and a user interface selection device. 

17. The system recited in claim 16, wherein the graphical 
user interface employs a video cassette recorder (VCR)
paradigm to enable a user to analyze the performance of
the data processing system. 

18. The system recited in claim 17, wherein the VCR 
paradigm displays user-selectable commands from a 
group comprising playing, replaying, stopping, reversing, 
pausing, and changing the speed of the model. 

19. The system recited in claim 11, wherein the control 
station uses heuristics from a group comprising time- 
ordering, causality information, and event handles to 
display the model. 

20. The system recited in claim 11, wherein the control 
station displays active portions of the model in a visually 
distinctive manner. 

21. A method for analyzing and displaying the performance 
of a data processing system that produces events and that 
comprises a control station that analyzes events, and an 
event concentrator, the method comprising the steps of: 
the event concentrator collecting events; and 
the control station displaying a model of the functionally
active structure of the data processing system.

22. The method recited in claim 21, wherein the model is a 
hierarchical model. 

23. The method recited in claim 22, wherein the hierarchical 
model includes items from a group comprising machines, 
processes, data sources, entities, and instances. 

24. The method recited in claim 21, and further comprising 
the step of: 

the control station displaying items from a group com- 
prising summary data, time data, event details, and a 
call tree simultaneously while displaying the model. 

25. The method recited in claim 21, and further comprising 
the step of: 

updating the model in real time as the control station 
analyzes the events. 

26. The method recited in claim 21, wherein the control 
station comprises a graphical user interface including a 
display and a user interface selection device. 

27. The method recited in claim 26, wherein the graphical 
user interface employs a video cassette recorder (VCR) 
paradigm to enable a user to analyze the performance of 
the data processing system. 

28. The method recited in claim 27, and further comprising 
the step of: 
the VCR paradigm displaying user-selectable commands
from a group comprising playing, replaying, stopping,
reversing, pausing, and changing the speed of the
model.

29. The method recited in claim 21, and further comprising
the step of: 

the control station using heuristics from a group compris- 
ing time-ordering, causality information, and event 
handles to generate and display the model. 

30. The method recited in claim 21, and further comprising
the step of: 

the control station displaying active portions of the model 
in a visually distinctive manner. 

31. The method recited in claim 21, wherein the steps recited 
therein can be performed in any suitable order.

32. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 21. 

33. A method for analyzing and displaying the performance
of a data processing system that comprises an in-process
event creator that collects events generated by a first data 
source within the data processing system, a dynamic 
event creator that collects events that are generated on a 
time basis by a second data source within the data 
processing system, an event concentrator, and a control
station that analyzes events, the method comprising the 
steps of: 

the in-process event creator and the dynamic event creator 
each collecting events from their respective data
sources;

the event concentrator collecting events from the
in-process event creator and from the dynamic event
creator and sending the events to the control station;
and

the control station displaying a model of the functionally
active structure of the data processing system.

34. The method recited in claim 33, wherein the control 
station provides a user interface for enabling user selection
of one or more portions of the model.

35. The method recited in claim 34, and further comprising 
the step of: 

exploding a portion of the model to show more detail, in 
response to a user selection. 

36. The method recited in claim 34, and further comprising
the step of: 

contracting a portion of the model to show less detail, in 
response to a user selection. 

37. The method recited in claim 33, wherein the model is a 
hierarchical model.

38. The method recited in claim 37, wherein the hierarchical 
model includes items from a group comprising machines, 
processes, data sources, entities, and instances. 

39. The method recited in claim 33, and further comprising 
the step of:

the control station displaying items from a group comprising
summary data, time data, event details, and a
call tree simultaneously while displaying the model. 

40. The method recited in claim 33, and further comprising 
the step of:

updating the model in real time as the control station
analyzes the events.

41. The method recited in claim 33, wherein the control 
station comprises a graphical user interface including a 
display and a user interface selection device.

42. The method recited in claim 41, wherein the graphical 
user interface employs a video cassette recorder (VCR) 
paradigm to enable a user to analyze the performance of
the data processing system. 

43. The method recited in claim 42, and further comprising 
the step of: 

the VCR paradigm displaying user-selectable commands 
from a group comprising playing, replaying, stopping, 
reversing, pausing, and changing the speed of the 
model. 

44. The method recited in claim 33, and further comprising 
the step of: 

the control station using heuristics from a group compris- 
ing time-ordering, causality information, and event 
handles to generate and display the model. 

45. The method recited in claim 33, and further comprising 
the step of: 

the control station displaying active portions of the model 
in a visually distinctive manner. 

46. The method recited in claim 33, wherein the steps recited 
therein can be performed in any suitable order. 

47. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 33. 

48. In a system for analyzing the performance of a data 
processing system which produces events, and which has 
a graphical user interface including a display and a user 
interface selection device, a method of providing an 
animated model of the performance of the data processing 
system and enabling user expansion or contraction of 
portions of the animated model, the method comprising 
the steps of: 

displaying the model showing the functionally active 
structure of the data processing system; 

receiving a selection signal indicative of the user interface 
selection device selecting a portion of the animated 
model; 

receiving an expansion or contraction command; and 
performing an expansion or contraction function on the 
selected portion in response to the selection signal and 
the expansion or contraction command, as appropriate. 

49. The method recited in claim 48, wherein the model is a 
hierarchical model. 

50. The method recited in claim 49, wherein the hierarchical 
model includes items from a group comprising machines, 
processes, data sources, entities, and instances. 

51. The method recited in claim 48, and further comprising 
the step of: 

the control station displaying items from a group com- 
prising summary data, time data, event details, and a 
call tree simultaneously while displaying the model. 

52. The method recited in claim 48, and further comprising 
the step of: 

updating the model in real time as the control station
analyzes the events. 

53. The method recited in claim 48, wherein the graphical 
user interface employs a video cassette recorder (VCR) 
paradigm to enable a user to analyze the performance of 
the data processing system. 

54. The method recited in claim 53, and further comprising 
the step of: 

the VCR paradigm displaying user-selectable commands 
from a group comprising playing, replaying, stopping, 
reversing, pausing, and changing the speed of the 
model. 

55. The method recited in claim 48, and further comprising 
the step of: 
the control station using heuristics from a group comprising
time-ordering, causality information, and event
handles to generate and display the model. 

56. The method recited in claim 48, and further comprising 
the step of:

the control station displaying active portions of the model
in a visually distinctive manner.

57. The method recited in claim 48, wherein the steps recited 
therein can be performed in any suitable order. 

58. A computer-readable medium having computer-
executable instructions for performing the steps recited in
claim 48. 

Performance Analysis 

1. A system for analyzing the performance of a data processing
system that produces events, the system comprising:

an event concentrator that collects events; and

a control station, coupled to the event concentrator, that 
analyzes events, the control station displaying a call 
tree of the functionally active structure of the data
processing system. 

2. The system recited in claim 1, wherein the control station 
displays items from a group comprising Gantt style 
charts, process summary data, performance summary 
data, and time data simultaneously while displaying the 
call tree. 

3. The system recited in claim 2, wherein the displayed items 
are time-synchronized. 

4. The system recited in claim 1, wherein the call tree is 
updated in real time as the control station analyzes the
events. 

5. The system recited in claim 1, wherein the control station 
analyzes events collected by the event concentrator as the 
events are collected. 

6. The system recited in claim 1, wherein the control station
analyzes events collected by the event concentrator after 
the events are collected. 

7. The system recited in claim 1, wherein the control station 
provides a user interface for enabling user selection of one
or more portions of the call tree.

8. The system recited in claim 7, wherein the control station 
explodes a portion of the call tree to show more detail, in 
response to a user selection. 

9. The system recited in claim 7, wherein the control station 
contracts a portion of the call tree to show less detail, in
response to a user selection. 

10. The system recited in claim 1, wherein the control station 
uses heuristics from a group comprising time-ordering, 
causality information, and event handles to display the 
call tree.
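
As an illustration only, the heuristics named in claim 10 (time-ordering and causality information) could be combined to rebuild call trees from a flat event stream. The event layout used here (dictionaries with `time`, `causality_id`, `kind`, and `name` keys) and the function `build_call_trees` are assumptions made for this sketch, not the claimed implementation.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def build_call_trees(events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Group events by causality id and nest calls by time order.

    Each event is assumed to carry "time", "causality_id", "kind"
    ("call" or "return"), and "name". One open-call stack is kept per
    causality chain, so interleaved chains do not disturb each other.
    """
    roots: List[Dict[str, Any]] = []
    stacks = defaultdict(list)
    for ev in sorted(events, key=lambda e: e["time"]):  # time-ordering
        stack = stacks[ev["causality_id"]]              # causality information
        if ev["kind"] == "call":
            node = {"name": ev["name"], "start": ev["time"],
                    "end": None, "children": []}
            # Attach to the innermost open call of this chain, if any.
            (stack[-1]["children"] if stack else roots).append(node)
            stack.append(node)
        elif ev["kind"] == "return" and stack:
            # Close the innermost open call of this chain.
            stack.pop()["end"] = ev["time"]
    return roots
```

Keeping a separate stack per causality id is what lets calls from concurrent activities interleave in time without being nested under one another.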

11. A system for analyzing the performance of a data 
processing system that produces events, the system com- 
prising: 

an in-process event creator that collects events generated 
by a first data source within the data processing system;

a dynamic event creator that collects events that are 
generated on a time basis by a second data source 
within the data processing system; 

an event concentrator collecting events from the 
in-process event creator and from the dynamic event
creator, and sending the events to the control station; 

a control station that analyzes events, the control station 
displaying a call tree of the functionally active structure 
of the data processing system.

12. The system recited in claim 11, wherein the control
station displays items from a group comprising Gantt
style charts, process summary data, performance summary
data, and time data simultaneously while displaying
the call tree. 

13. The system recited in claim 12, wherein the displayed 
items are time-synchronized. 

14. The system recited in claim 11, wherein the call tree is 
updated in real time as the control station analyzes the
events.

15. The system recited in claim 11, wherein the control 
station analyzes events collected by the event concentra- 
tor as the events are collected. 

16. The system recited in claim 11, wherein the control 
station analyzes events collected by the event concentra- 
tor after the events are collected. 

17. The system recited in claim 11, wherein the control 
station provides a user interface for enabling user selec- 
tion of one or more portions of the call tree. 

18. The system recited in claim 17, wherein the control 
station explodes a portion of the call tree to show more 
detail, in response to a user selection. 

19. The system recited in claim 17, wherein the control 
station contracts a portion of the call tree to show less 
detail, in response to a user selection. 

20. The system recited in claim 11, wherein the control 
station uses heuristics from a group comprising time- 
ordering, causality information, and event handles to 
display the call tree. 

21. A method for analyzing and displaying the performance 
of a data processing system that produces events and that 
comprises a control station that analyzes events, and an 
event concentrator, the method comprising the steps of: 
the event concentrator collecting events; and 

the control station displaying a call tree of the functionally 
active structure of the data processing system. 

22. The method recited in claim 21, and further comprising 
the step of: 

the control station displaying items from a group com- 
prising Gantt style charts, process summary data, per- 
formance summary data, and time data simultaneously 
while displaying the call tree. 

23. The method recited in claim 22, wherein the displayed 
items are time-synchronized. 

24. The method recited in claim 21, and further comprising 
the step of: 

updating the call tree in real time as the control station 
analyzes the events. 

25. The method recited in claim 21, and further comprising 
the step of: 

the control station analyzing events collected by the event 
concentrator as the events are collected. 

26. The method recited in claim 21, and further comprising 
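the step of:

the control station analyzing events collected by the event
concentrator after the events are collected.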

27. The method recited in claim 21, wherein the control 
station comprises a graphical user interface including a 
display and a user interface selection device. 

28. The method recited in claim 27, and further comprising 
the step of: 

exploding a portion of the call tree to show more detail, 
in response to a user selection. 

29. The method recited in claim 27, and further comprising 
the step of: 

contracting a portion of the call tree to show less detail,
in response to a user selection. 

30. The method recited in claim 21, and further comprising 
the step of: 
the control station using heuristics from a group comprising
time-ordering, causality information, and event
handles to generate and display the call tree. 

31. The method recited in claim 21, wherein the steps recited 
therein can be performed in any suitable order.

32. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 21. 

33. A method for analyzing and displaying the performance
of a data processing system that comprises an in-process
event creator that collects events generated by a first data 
source within the data processing system, a dynamic 
event creator that collects events that are generated on a 
time basis by a second data source within the data 
processing system, an event concentrator, and a control
station that analyzes events, the method comprising the 
steps of: 

the in-process event creator and the dynamic event creator 
each collecting events from their respective data 
sources;

the event concentrator collecting events from the 
in-process event creator and from the dynamic event 
creator and sending the events to the control station; 

the control station displaying a call tree of the functionally
active structure of the data processing system, and 
further providing a user interface for enabling user 
selection of one or more portions of the call tree. 

34. The method recited in claim 33, and further comprising 
the step of:

the control station displaying items from a group comprising
Gantt style charts, process summary data, performance
summary data, and time data simultaneously
while displaying the call tree. 

35. The method recited in claim 34, wherein the displayed
items are time-synchronized. 

36. The method recited in claim 33, and further comprising 
the step of: 

updating the call tree in real time as the control station 
analyzes the events. 

37. The method recited in claim 33, and further comprising 
the step of: 

the control station analyzing events collected by the event 
concentrator as the events are collected. 

38. The method recited in claim 33, and further comprising 
the step of: 

the control station analyzing events collected by the event 
concentrator after the events are collected. 

39. The method recited in claim 33, wherein the control 
station comprises a graphical user interface including a 
display and a user interface selection device. 

40. The method recited in claim 39, and further comprising 
the step of: 

exploding a portion of the call tree to show more detail, 
in response to a user selection. 

41. The method recited in claim 39, and further comprising 
the step of: 

contracting a portion of the call tree to show less detail, 
in response to a user selection. 

42. The method recited in claim 33, and further comprising 
the step of: 

the control station using heuristics from a group compris- 
ing time-ordering, causality information, and event 
handles to generate and display the call tree. 

43. The method recited in claim 33, wherein the steps recited 
therein can be performed in any suitable order. 



44. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 33. 

45. In a system for analyzing the performance of a data 
processing system which produces events, and which has 
a graphical user interface including a display and a user 
interface selection device, a method of providing a call 
tree of the performance of the data processing system and 
enabling user expansion or contraction of portions of the 
call tree, the method comprising the steps of: 

displaying the call tree showing the functionally active 
structure of the data processing system; 

receiving a selection signal indicative of the user interface 
selection device selecting a portion of the call tree; 

receiving an expansion or contraction command; and 

performing an expansion or contraction function on the 
selected portion in response to the selection signal and 
the expansion or contraction command, as appropriate. 
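
By way of illustration only, the select-then-expand-or-contract steps of claim 45 might be sketched as follows. This is a minimal sketch under stated assumptions: the names `CallNode`, `apply_command`, and `render`, and the command strings "expand" and "contract", are hypothetical and do not appear in the claims.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CallNode:
    """One entry in a call tree (hypothetical structure)."""
    label: str
    children: List["CallNode"] = field(default_factory=list)
    expanded: bool = False  # contracted nodes hide their children


def apply_command(selected: CallNode, command: str) -> None:
    """Perform the expansion or contraction function on the selected portion."""
    if command == "expand":
        selected.expanded = True
    elif command == "contract":
        selected.expanded = False
    else:
        raise ValueError(f"unknown command: {command!r}")


def render(node: CallNode, depth: int = 0) -> List[str]:
    """Return the visible lines of the tree, indenting two spaces per level."""
    lines = ["  " * depth + node.label]
    if node.expanded:
        for child in node.children:
            lines.extend(render(child, depth + 1))
    return lines
```

In this sketch, expanding a node reveals only its direct children; each child stays contracted until it is itself selected and expanded.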

46. The method recited in claim 45, and further comprising 
the step of: 

the control station displaying items from a group com- 
prising Gantt style charts, process summary data, per- 
formance summary data, and time data simultaneously 
while displaying the call tree. 

47. The method recited in claim 46, wherein the displayed 
items are time-synchronized. 

48. The method recited in claim 45, and further comprising 
the step of: 

updating the call tree in real time as the control station 
analyzes the events. 

49. The method recited in claim 45, and further comprising 
the step of: 

the control station analyzing events collected by the event 
concentrator as the events are collected. 

50. The method recited in claim 45, and further comprising 
the step of: 



51. The method recited in claim 45, and further comprising 
the step of: 

the control station using heuristics from a group compris- 
ing time-ordering, causality information, and event 
handles to generate and display the call tree. 

52. The method recited in claim 45, wherein the steps recited 
therein can be performed in any suitable order. 

53. A computer-readable medium having computer- 
executable instructions for performing the steps recited in 
claim 45. 

We claim: 

1. A system for analyzing the performance of a data 
processing system comprising: 

a control station adapted to control at least one event 
concentrator to enable monitoring of a process; 

an in-process event creator associated with the monitored 
process, the event creator collecting events generated 
by the monitored process when enabled by the at least 
one event concentrator; and 

the at least one event concentrator, coupled to the control 
station and to the in-process event creator, that collects 
events from the in-process event creator and sends 
them to the control station. 
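
By way of illustration only, the three elements of claim 1 and the flow of events between them might be sketched as follows. This is a minimal sketch under stated assumptions: the class and method names (`ControlStation`, `EventConcentrator`, `InProcessEventCreator`, `start_monitoring`, `flush`, and so on) are hypothetical, and an in-memory list stands in for whatever transport a real system would use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    source: str  # name of the monitored process
    name: str    # what happened


class InProcessEventCreator:
    """Associated with the monitored process; collects its events when enabled."""

    def __init__(self, process_name: str) -> None:
        self.process_name = process_name
        self.enabled = False
        self.pending: List[Event] = []

    def on_event(self, name: str) -> None:
        # Events generated while monitoring is disabled are not collected.
        if self.enabled:
            self.pending.append(Event(self.process_name, name))

    def drain(self) -> List[Event]:
        events, self.pending = self.pending, []
        return events


class ControlStation:
    """Controls concentrators and receives the events they send."""

    def __init__(self) -> None:
        self.received: List[Event] = []

    def start_monitoring(self, concentrator: "EventConcentrator") -> None:
        concentrator.enable_monitoring()

    def receive(self, events: List[Event]) -> None:
        self.received.extend(events)


class EventConcentrator:
    """Coupled to both the control station and the event creators."""

    def __init__(self, station: ControlStation) -> None:
        self.station = station
        self.creators: List[InProcessEventCreator] = []

    def attach(self, creator: InProcessEventCreator) -> None:
        self.creators.append(creator)

    def enable_monitoring(self) -> None:
        for creator in self.creators:
            creator.enabled = True

    def flush(self) -> None:
        # Collect events from each creator and send them to the station.
        for creator in self.creators:
            self.station.receive(creator.drain())
```

The sketch keeps the enabling path (station to concentrator to creator) separate from the data path (creator to concentrator to station), mirroring the direction of control and of event flow recited in the claim.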

2. A system as recited in claim 1, wherein the in-process 
event creator is coupled to the control station and can be 
turned on or off by the control station. 

3. The system recited in claim 1, wherein the in-process 
event creator can be created and removed by the control 
station. 

4. A system as recited in claim 1, and further comprising: 

a dynamic event creator, coupled to the event 
concentrator, that collects data that is generated on a 
time basis; 

and wherein the event concentrator collects data from the 
dynamic event creator and sends it to the control 
station. 

5. A system as recited in claim 4, wherein the dynamic 
event creator is coupled to the control station and can be 
turned on or off by the control station. 

6. The system recited in claim 4, wherein the dynamic 
event creator can be created and removed by the control 
station. 

7. A system for analyzing the structure and operation of an 
application executing on a data processing system compris- 
ing: 

a control station adapted to control at least one event 
concentrator to enable monitoring of an application; 

an in-process event creator associated with the monitored 
application, the event creator collecting events gener- 
ated by the execution of the application when enabled 
by the at least one event concentrator; and 

an event concentrator, coupled to the control station and 
to the in-process event creator, that collects events from 
the in-process event creator and sends them to the 
control station. 

8. A system as recited in claim 7, wherein the in-process 
event creator is coupled to the control station and can be 
turned on or off by the control station. 

9. The system recited in claim 7, wherein the in-process 
event creator can be created and removed by the control 
station. 

10. A system as recited in claim 7, and further comprising: 

a dynamic event creator, coupled to the event 
concentrator, that collects data that is generated on a 
time basis from the execution of the application; 

and wherein the event concentrator collects data from the 
dynamic event creator and sends it to the control 
station. 

11. A system as recited in claim 10, wherein the dynamic 
event creator is coupled to the control station and can be 
turned on or off by the control station. 

12. The system recited in claim 10, wherein the dynamic 
event creator can be created and removed by the control 
station. 

13. The system as recited in claim 7 wherein the appli- 
cation is executing on two or more data processing systems. 

14. A system for analyzing the performance of a network 
comprising two or more data processing systems 
comprising: 

a control station adapted to control at least one event 
concentrator to enable monitoring of a process; 

an in-process event creator associated with the monitored 
process, the event creator collecting events generated 
by the monitored process when enabled by the at least 
one event concentrator; and 

an event concentrator, coupled to the control station and 
to the in-process event creator, that collects events from 
the in-process event creator and sends them to the 
control station. 

15. A system as recited in claim 14, wherein the in-process 
event creator is coupled to the control station and can be 
turned on or off by the control station. 

16. The system recited in claim 14, wherein the in-process 
event creator can be created and removed by the control 
station. 

17. The system recited in claim 14, wherein a separate 
in-process event creator can be created and removed by the 
control station for each data processing system. 

18. A system as recited in claim 14, and further 
comprising: 

a dynamic event creator, coupled to the event 
concentrator, that collects data that is generated on a 
time basis; 

and wherein the event concentrator collects data from the 
dynamic event creator and sends it to the control 
station. 

19. A system as recited in claim 18, wherein the dynamic 
event creator is coupled to the control station and can be 
turned on or off by the control station. 

20. The system recited in claim 18, wherein the dynamic 
event creator can be created and removed by the control 
station. 

21. A method of analyzing the performance of a data 
processing system having at least one program module, a 
control station, and an event concentrator, the method com- 
prising the steps of: 

the at least one program module creating an in-process 
event creator; 

the in-process event creator collecting events generated 
by a data source within the data processing system; and 

the event concentrator collecting events from the 
in-process event creator and sending them to the con- 
trol station. 

22. The method recited in claim 21, wherein the control 
station turns the in-process event creator on or off. 

23. The method recited in claim 22, wherein the control 
station turns the in-process event creator on or off by setting 
or disabling a status condition in the in-process event 
creator. 

24. The method recited in claim 21, wherein the event 
concentrator buffers a predetermined quantity of the events 
and only stores the events on request of the control station. 
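
By way of illustration only, the buffer-then-store-on-request behavior of claim 24 might be sketched as follows. This is a minimal sketch under stated assumptions: the class name `BufferingConcentrator` and its methods are hypothetical, events are represented as plain strings, and discarding the oldest event once the buffer is full is an assumption the claim does not state.

```python
from collections import deque
from typing import Deque, List


class BufferingConcentrator:
    """Holds at most `capacity` events; persists them only on request."""

    def __init__(self, capacity: int) -> None:
        # ASSUMPTION: when the buffer is full, the oldest event is dropped.
        self._buffer: Deque[str] = deque(maxlen=capacity)
        self.stored: List[str] = []  # stands in for durable storage

    def collect(self, event: str) -> None:
        self._buffer.append(event)

    def store_on_request(self) -> int:
        """Called on behalf of the control station; returns the count stored."""
        count = len(self._buffer)
        self.stored.extend(self._buffer)
        self._buffer.clear()
        return count
```

Nothing reaches `stored` until `store_on_request` is called, so events that the control station never asks for cost only the bounded buffer.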

25. The method recited in claim 21, further comprising the 
steps of: 

the at least one program module creating a dynamic event 
creator; 

the dynamic event creator collecting data that is generated 
on a time basis; and 

the event concentrator collecting data from the dynamic 
event creator and sending it to the control station. 

26. The method recited in claim 25, wherein the control 
station turns the dynamic event creator on or off. 

27. The method recited in claim 26, wherein the event 
concentrator buffers a predetermined quantity of the data 
and only stores the data on request of the control station. 

28. The method recited in claim 21, wherein the at least 
one program module first creates an in-process event creator 
reference in the creating step, and further comprising the 
step of: a local event concentrator converting the in-process 
event creator reference to an in-process event creator. 

29. The method recited in claim 21, wherein the control 
station creates the event concentrator. 

30. The method recited in claim 21, wherein the steps 
recited therein can be performed in any suitable order. 

31. A computer-readable medium having computer-
executable instructions for analyzing the performance of a 
data processing system having at least one program module, 
a control station and an event concentrator, the computer-
executable instructions performing the steps comprising: 

the at least one program module creating an in-process 
event creator; 

the in-process event creator collecting events generated 
by a data source within the data processing system; and 

the event concentrator collecting events from the 
in-process event creator and sending them to the con- 
trol station. 